Forecasting load-duration curves



Journal of Forecasting, Vol. 13, 545-559 (1994)

Forecasting Load-duration Curves

ANDREW BRUCE, SIMON JURKE AND PETER THOMSON Victoria University of Wellington, New Zealand

ABSTRACT A new method is proposed for forecasting electricity load-duration curves. The approach first forecasts the load curve and then uses the resulting predictive densities to forecast the load-duration curve. A virtue of this procedure is that both load curves and load-duration curves can be predicted using the same model, and confidence intervals can be generated for both predictions. The procedure is applied to the problem of predicting New Zealand electricity consumption. A structural time-series model is used to forecast the load curve based on half-hourly data. The model is tailored to handle effects such as daylight savings, holidays and weekends, as well as trend, annual, weekly and daily cycles. Time-series methods, including Kalman filtering, smoothing and prediction, are used to fit the model and to achieve the desired forecasts of the load-duration curve.

KEY WORDS Forecasting; Load-duration curve; Load curve; Structural time-series models

INTRODUCTION

Accurate forecasting of electricity demand is vital for a variety of operational problems in power generation. These problems include optimal use of power plants to minimize expected costs, scheduling of maintenance, and short-term spot pricing. Key constructs to be forecasted are the demand for electricity over time, called the load curve, and the distribution of loads over a given period of time which is referred to as the load-duration curve. Forecasting the latter is the main objective of this paper.

In combination with the distributions of power generated from the various plants, the load-duration curve can be used as input to non-linear optimization procedures which minimize the expected costs of meeting forecast demand. The load-duration curve also provides important measures of reliability such as the loss of load probability and the expected unserved demand. Further details concerning the load-duration curve and its uses can be found in Baleriaux et al. (1967), Booth (1972), EPRI (1982) and Gates (1985).

In many electricity production costing systems, load-duration curves are commonly represented by Gram-Charlier or Edgeworth expansions involving the cumulants of the load distribution (see EPRI, 1982, for details). The cumulants are estimated from sample moments obtained from historical data. This approach has the advantage that convolutions of the load-duration curve with stochastic supply are straightforward and computationally efficient. Other approaches have used piecewise linear functions or Fourier series to construct analytic representations of the load-duration curve. These methods typically work well when forecasting over periods of the order of quarters or years where the load-duration curve shows little variation. However, over periods of the order of a week, large changes are apparent due to weather, holidays and other effects. This increases the need to develop a formal prediction process to forecast load-duration curves over such time intervals.

CCC 0277-6693/94/060545-15. Received July 1992; revised May 1994. © 1994 by John Wiley & Sons, Ltd.

By contrast, considerably more research effort has been expended on the forecasting of load curves. For a collection of papers detailing approaches to forecasting load curves see Bunn (1987). The paper by Harvey and Koopman (1993) uses load curve forecasting procedures and structural time-series models that are similar to the ones adopted here. A related and important problem is that of forecasting peak electricity demand over a period (peak hourly demand over a day, say). Further information on this particular aspect of electricity demand forecasting is given in Engle et al. (1992) and the references contained therein.

The strategy we adopt is based on first obtaining forecasts for the load curve over a given interval of time using a variant of a time series structural model (Harvey, 1989). These forecasts are then used to predict the load-duration curve using the minimum mean squared error predictor. This approach is conceptually appealing since it unifies the problems of forecasting the load curve and load-duration curve. It also yields a confidence interval for the load-duration curve which is not so readily available using other techniques. As will be seen below, this approach essentially approximates the distribution of loads by a weighted sum of (transformed) normal distributions. This strategy also has the virtue of being largely independent of the model and method adopted for forecasting the load curve.

We apply the strategy outlined above to the problem of forecasting load-duration curves in the important case of one-week-ahead forecasting of half-hourly electricity demand for New Zealand. The nature of the data is described in the next section. The general method for forecasting the load-duration curve is developed in the third section and an application of the forecasting technology is given in the final section.

EMPIRICAL LOAD-DURATION CURVES

The load curve represents electricity demand over time. Figure 1 displays the load curve on a half-hourly basis for all of New Zealand for the week Friday 1 April to Thursday 7 April 1988. This week includes Easter with 1 April being Good Friday. Note that the load is much higher during the day than at night, and the load peaks in the morning and early evening periods, reflecting the relatively high proportion of electricity usage by domestic consumers. The load is higher on weekdays as opposed to weekends and holidays and the shape of the load curve also changes depending on the day of the week. A discontinuity occurs at 11 pm due to the combined effects of a reduction in the price of power at this time and centrally switched domestic hot water supplies. This effect is more apparent in weekends and holidays when domestic usage dominates. Clearly, the load curve is highly structured and complex, requiring careful modelling to obtain accurate forecasts. In fact, several other features not evident in this plot must also be properly accounted for.

In general, the load-duration curve represents the proportion of the total time that a load $x$ is exceeded in any given period. In this paper we restrict our attention to load-duration curves defined over a period of one week. In time periods as short as one week, there is significant variation in the load-duration curve due to weather, holidays, etc. For this reason, more sensitive forecasting procedures are required in comparison with those used for longer intervals such as quarters or years.

Figure 1. Load curve for the week beginning Good Friday 1 April 1988. Horizontal axis: half-hours since the start of the week

Figure 2. Empirical load-duration curve (left-hand plot) and corresponding load density (right-hand plot) for 1-7 April 1988. The lower quartile (L), median (M) and upper quartile (U) have been indicated on both plots. The load-duration curve corresponds to a distribution that is both multimodal and platykurtic. Horizontal axes: load (MW)

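Let $X_t$ be the aggregated electricity demand for New Zealand on a half-hourly basis. The empirical load-duration curve $P_n(x)$ is defined as the proportion of half-hourly loads exceeding $x$ in week $n$. Thus

$$P_n(x) = \frac{1}{7 \times 48} \sum_{t \,\in\, \text{week } n} \chi(X_t \ge x) \qquad (1)$$

where

$$\chi(X_t \ge x) = \begin{cases} 1 & \text{if } X_t \ge x \\ 0 & \text{otherwise} \end{cases}$$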

A plot of the empirical load duration curve for the week of 1-7 April 1988 is given in Figure 2 together with a histogram representing the corresponding load density. The median together with the upper and lower quartiles have been indicated on both plots. Like the load curve on which it is based, the load-duration curve is highly complex and corresponds to a load distribution that is both multimodal and platykurtic.
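As a minimal sketch of the definition above, the empirical load-duration curve can be computed directly from one week of half-hourly loads. The function name `empirical_ldc` and the synthetic week are illustrative assumptions, and the comparison is taken as "greater than or equal", matching the probabilities $P(X_t \ge x \mid \text{data})$ used later in the paper.

```python
def empirical_ldc(loads, x):
    """Proportion of the half-hourly loads in one week that are >= x."""
    if not loads:
        raise ValueError("loads must be non-empty")
    return sum(1 for load in loads if load >= x) / len(loads)


# Hypothetical week of 7 * 48 half-hourly loads (MW), for illustration only.
week = [2000 + 10 * (t % 48) for t in range(7 * 48)]

# Evaluate the curve on a grid of load levels.
curve = [(x, empirical_ldc(week, x)) for x in range(2000, 2500, 100)]
```

Evaluating the function on a grid of load levels, as in the last line, traces out the curve, which is non-increasing in the load level.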

LOAD-DURATION CURVE FORECASTING STRATEGY

A number of forecasting strategies are possible, including the explicit modelling of the load-duration curves or their corresponding densities (see EPRI, 1982, for example). The basic approach taken here is to first forecast the time series of total loads $X_t$. An estimate of the load-duration curve is then obtained from the predictive densities for $X_t$.

Given data up to and including week $n - 1$, the best mean squared error predictor of the load-duration curve for week $n$ is given by

$$\hat{P}_n(x) = E(P_n(x) \mid \text{data}) = \frac{1}{7 \times 48} \sum_{t \,\in\, \text{week } n} P(X_t \ge x \mid \text{data}) \qquad (2)$$

Here the operator $E$ denotes expectation with respect to the distribution specified in the stochastic model adopted for $X_t$. Moreover, the mean squared prediction error is

$$V_n(x) = \operatorname{var}(P_n(x) \mid \text{data}) = \left(\frac{1}{7 \times 48}\right)^2 \sum_{s \,\in\, \text{week } n} \; \sum_{t \,\in\, \text{week } n} \psi(s, t) \qquad (3)$$

where

$$\psi(s, t) = P(X_s \ge x, X_t \ge x \mid \text{data}) - P(X_s \ge x \mid \text{data}) \, P(X_t \ge x \mid \text{data})$$

This can be used to generate pointwise confidence intervals for $P_n(x)$.

In order to obtain $\hat{P}_n(x)$ and $V_n(x)$ we need to compute $P(X_t \ge x \mid \text{data})$ and $P(X_s \ge x, X_t \ge x \mid \text{data})$. The central role of the predictive densities is now clear since these probabilities are derived from the predictive distributions for $X_t$. Hence the problem of estimating $P_n(x)$ and its variance has been reduced to the problem of obtaining reasonable predictive distributions for $X_t$. In this way the basic model structure identified for $X_t$ underpins the load-duration curve forecasts and their properties in a completely natural and transparent way.
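The predictor and its mean squared error are simple averages once the marginal and joint exceedance probabilities are available. The sketch below assumes those probabilities have already been computed by some forecasting model; the function names and the toy numbers are hypothetical.

```python
def ldc_forecast(p):
    """Average of the marginal probabilities P(X_t >= x | data)."""
    return sum(p) / len(p)


def ldc_forecast_variance(p, joint):
    """Mean squared prediction error of the forecast at load level x.

    p[t]        : P(X_t >= x | data)
    joint[s][t] : P(X_s >= x, X_t >= x | data), with joint[t][t] == p[t]
    """
    n = len(p)
    total = 0.0
    for s in range(n):
        for t in range(n):
            total += joint[s][t] - p[s] * p[t]   # psi(s, t)
    return total / n ** 2


# Toy example with three periods that are independent given the data,
# so only the diagonal terms p_t * (1 - p_t) contribute.
p = [0.2, 0.5, 0.9]
joint = [[p[s] if s == t else p[s] * p[t] for t in range(3)] for s in range(3)]
p_hat = ldc_forecast(p)
v = ldc_forecast_variance(p, joint)
```

For a full week the lists would have 7 × 48 = 336 entries, and the joint probabilities would generally reflect the serial correlation of the load forecasts rather than independence.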

Instead of modelling $X_t$ directly, we model $g(X_t)$ as a Gaussian structural time series where $g(x)$ is some suitable transformation. The class of power transformations defined by

$$g(x) = \begin{cases} x^p & \text{if } p > 0 \\ \log x & \text{if } p = 0 \\ -x^p & \text{if } p < 0 \end{cases} \qquad (4)$$

are a natural choice. In our particular case (see the following section) we adopted the logarithmic transformation $g(x) = \log x$ $(x > 0)$, although other transformations such as the shifted logarithmic transformation $g(x) = \log(x - c)$ $(x > c)$ also have merit given the sharp cut-off in the lower tail of the load density illustrated in Figure 2.
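A minimal sketch of the transformation family follows. It assumes that the case p = 0 corresponds to the logarithm, which is how the paper later describes its own choice; the function names are illustrative. The sign change for negative powers keeps the transformation increasing in the load.

```python
import math


def power_transform(x, p):
    """Power transformation g(x) for a positive load x."""
    if x <= 0:
        raise ValueError("x must be positive")
    if p > 0:
        return x ** p
    if p < 0:
        return -(x ** p)      # sign flip keeps g increasing in x
    return math.log(x)        # p = 0: logarithmic transformation


def shifted_log(x, c):
    """Shifted logarithmic transformation log(x - c), defined for x > c."""
    if x <= c:
        raise ValueError("x must exceed the shift c")
    return math.log(x - c)
```

Because every member of the family is increasing, the event that the load exceeds x is the same as the event that the transformed load exceeds g(x).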

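This implies that the predictive distribution for $g(X_t)$ is Gaussian and

$$P(X_t \ge x \mid \text{data}) = 1 - \Phi\big((g(x) - \mu_t)/\sigma_t\big)$$

where $\Phi(\cdot)$ is the cumulative distribution function of the standard normal. The quantities $\mu_t$ and $\sigma_t^2$ are the conditional mean and variance

$$\mu_t = E(g(X_t) \mid \text{data}), \qquad \sigma_t^2 = \operatorname{var}(g(X_t) \mid \text{data}) \qquad (5)$$

respectively. Note that $\mu_t$ is the best predictor of $g(X_t)$ given the data, and $\sigma_t^2$ is the prediction variance. These are readily obtained using standard time-series techniques (e.g. the Kalman filter; see Harvey, 1989).

A similar expression can be obtained for $\psi(s, t)$. This expression involves the calculation of the distribution function for a bivariate normal. For $s$ and $t$ in week $n$ it is readily seen that

$$\psi(s, t) = T\!\left(\frac{g(x) - \mu_s}{\sigma_s}, \frac{g(x) - \mu_t}{\sigma_t}, \rho_{st}\right) - \left[1 - \Phi\!\left(\frac{g(x) - \mu_s}{\sigma_s}\right)\right]\left[1 - \Phi\!\left(\frac{g(x) - \mu_t}{\sigma_t}\right)\right]$$

where $\rho_{st}$ is the correlation between $g(X_s)$ and $g(X_t)$ given the data. The function $T(\cdot)$ is defined as

$$T(h, k, \rho) = P(Z_1 \ge h, Z_2 \ge k)$$

where $Z_1$ and $Z_2$ have a standardized bivariate normal distribution with correlation $\rho$. This function can be calculated from the expansion

$$T(h, k, \rho) = \sum_{j=0}^{\infty} \rho^j \, \tau_j(h) \, \tau_j(k)$$

where $\tau_0(h) = 1 - \Phi(h)$ and, for $j \ge 1$,

$$\tau_j(h) = \frac{He_{j-1}(h) \, \phi(h)}{\sqrt{j!}}$$

is the $j$th tetrachoric function, with $He_j$ the Hermite polynomial of degree $j$ and $\phi$ the standard normal density (see Johnson and Kotz, 1972, for further details). There is a direct link between the tetrachoric functions and Hermite polynomials which can be utilized to give recursive relations that minimize computational cost. These recursions are given by

$$\tau_1(h) = \phi(h), \qquad \tau_2(h) = \frac{h \, \phi(h)}{\sqrt{2}}, \qquad \tau_{j+1}(h) = \frac{h}{\sqrt{j+1}} \, \tau_j(h) - \frac{j-1}{\sqrt{j(j+1)}} \, \tau_{j-1}(h)$$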

for $j > 1$.

This strategy for forecasting the load-duration curve has three major advantages. First, it is largely independent of the method adopted for forecasting the load curve. Second, it unifies the problems of forecasting both the load curve and the load-duration curve. Third, the approach provides a straightforward formal mechanism for determining pointwise confidence intervals for the load-duration curve. Combined with a flexible forecasting methodology and a sensitive choice of transformation, this strategy should result in reasonable forecasts of the load-duration curve and its forecast error.
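The probabilities that feed the forecast and its variance can be sketched numerically as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes the logarithmic transformation, uses the standard tetrachoric series for the bivariate normal upper-orthant probability (the first term is the product of the two marginal tail probabilities, later terms involve Hermite polynomials), and truncates the series at a hypothetical `n_terms`. All function names are illustrative.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def exceedance_prob(x, mu, sigma):
    """P(X >= x | data) when log X is Gaussian with mean mu and sd sigma."""
    return 1.0 - _STD.cdf((math.log(x) - mu) / sigma)


def tetrachoric_terms(h, n_terms):
    """tau_0(h), ..., tau_{n_terms - 1}(h) via the three-term recursion."""
    phi = _STD.pdf(h)
    tau = [1.0 - _STD.cdf(h), phi, h * phi / math.sqrt(2.0)]
    for j in range(2, n_terms - 1):
        tau.append(h * tau[j] / math.sqrt(j + 1)
                   - (j - 1) * tau[j - 1] / math.sqrt(j * (j + 1)))
    return tau[:n_terms]


def bivariate_upper_orthant(h, k, rho, n_terms=60):
    """Truncated series for P(Z1 >= h, Z2 >= k) with correlation rho."""
    th = tetrachoric_terms(h, n_terms)
    tk = tetrachoric_terms(k, n_terms)
    return sum(rho ** j * a * b for j, (a, b) in enumerate(zip(th, tk)))


def psi(x, mu_s, sigma_s, mu_t, sigma_t, rho):
    """Covariance of the two exceedance indicators at load level x."""
    hs = (math.log(x) - mu_s) / sigma_s
    ht = (math.log(x) - mu_t) / sigma_t
    marginals = (1.0 - _STD.cdf(hs)) * (1.0 - _STD.cdf(ht))
    return bivariate_upper_orthant(hs, ht, rho) - marginals
```

A convenient check is the case h = k = 0, where the orthant probability has the closed form 1/4 + arcsin(ρ)/(2π). The series converges slowly when the correlation is close to ±1, so the truncation point would need care for adjacent half-hours with strongly correlated forecasts.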

FORECASTING NEW ZEALAND ELECTRICITY CONSUMPTION

The results presented in this section are the product of a research project undertaken by the authors for the Electricity Corporation of New Zealand Ltd. The project was concerned with forecasting total New Zealand electricity consumption on a half-hourly basis and, in particular, with one-week-ahead forecasting of the load-duration curve. Full details of the analyses undertaken and the techniques used are given in Bruce et al. (1990, 1991). The data consisted of half-hourly measurements of total electricity consumption in megawatts (MW) over the seven-year period from 1 April 1982 to 31 March 1989. To evaluate forecast capability, the data from 14 October 1988 were withheld from the model-fitting and identification stage.

Forecasting the load curve

The first step in the forecasting strategy is to predict the load curve for New Zealand electricity consumption. In this section we describe our load curve forecasting model and its performance.

Transformation choice

We modelled the logarithmic transforms of the data rather than the original untransformed series. Thus, in terms of equation (4), $p = 0$ and $g(x) = \log x$. This choice was based on a preliminary analysis of the data using the robust, non-parametric seasonal decomposition procedure SABL (see Cleveland et al., 1978), which selects the power transformation (4) that minimizes the interaction between the trend and cyclical components.

To identify an appropriate power transformation, SABL was applied to the original data and to the data averaged over a variety of time scales. The latter was done to ascertain the nature of the daily, weekly and annual cycles. In our case the various analyses favoured the logarithmic transformation which was subsequently adopted for the remainder of the study.

The logarithmic transform is an obvious and natural choice since it implies that the components of the model are multiplicatively related, as is the case with many other economic variables. Thus, as the New Zealand economy expands and the demand for electricity increases, systematic and non-systematic load fluctuations about the long-term trend will tend to increase proportionately.

Basic model

The transformed data $Y_t = \log X_t$ were modelled using a time-series structural model of the form

$$Y_t = L_t + R_t \qquad (6)$$

where the (slowly varying) daily level $L_t$ and the daily residual $R_t$ are given by

$$L_t = T_t + A_t + W_t + C_t, \qquad R_t = D_t + \varepsilon_t \qquad (7)$$

The components $T_t$, $A_t$, $W_t$ and $D_t$ denote the trend, annual cycle, weekly cycle and daily cycle, respectively; $\varepsilon_t$ represents uncorrelated white noise. A stationary error term $C_t$ was also included to model residual autocorrelation and, in particular, to pick up weather effects. All components in equations (6) and (7) are assumed independent.

Figure 3 shows the nature of the various components estimated using appropriate linear filters applied to $Y_t$. A simple non-parametric estimate of $T_t + A_t$ was obtained by applying a triangular moving-average smoother of length 671 to $Y_t$, filtering out the weekly and daily cycles. The residual from this operation was then smoothed by a 95-point triangular moving average which filtered out the daily cycle to give an estimate of $W_t$. In turn, the residual from this operation gave an estimate of $D_t$.
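A triangular moving average of this kind can be sketched as below. One reading of the filter lengths, offered here as an observation rather than something stated in the paper, is that 671 = 2 × 336 − 1 and 95 = 2 × 48 − 1, so each filter is equivalent to applying a one-week (336 half-hours) or one-day (48 half-hours) simple moving average twice, which removes cycles of exactly that period. Function names and the synthetic series are illustrative.

```python
import math


def triangular_weights(m):
    """Weights of a (2m - 1)-point triangular moving average, summing to 1."""
    raw = [m - abs(k) for k in range(-(m - 1), m)]
    total = float(sum(raw))                      # equals m * m
    return [w / total for w in raw]


def smooth(series, m):
    """Apply the filter at interior points; the ends are left as None."""
    w = triangular_weights(m)
    half = m - 1
    out = [None] * len(series)
    for t in range(half, len(series) - half):
        out[t] = sum(wk * series[t - half + i] for i, wk in enumerate(w))
    return out


# Synthetic log-load: constant level plus a pure daily cycle (period 48).
y = [5.0 + math.sin(2.0 * math.pi * t / 48) for t in range(300)]
level = smooth(y, 48)                            # 95-point filter
daily = [None if lv is None else v - lv for v, lv in zip(y, level)]
```

With m = 336 the same function gives the 671-point filter; applying the two filters in sequence to the successive residuals reproduces the decomposition order described above.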

Figure 3. Simple moving average estimates of the overall trend plus annual cycle, the weekly cycle for April-September 1988 and the daily cycle for June 1988 are given in the top, middle and bottom plots, respectively. The vertical bars on the right-hand sides of the plots are the same length in terms of the actual units, and show the relative importance of each component. The daily cycle and the trend plus annual cycle account for roughly the same amount of variability, while the weekly cycle accounts for the least

The vertical bars plotted on the right-hand side of Figure 3 are the same length in terms of the actual units, showing the relative importance of each component. The daily cycle and the trend plus annual cycle account for roughly the same amount of variability. The weekly cycle is the least significant, although the level of consumption during the weekends is markedly lower than on weekdays.

Modelling strategy

The model given by equations (6) and (7) involves an extremely wide diversity of time scales ranging from half-hourly, daily, and weekly through to annual. From a modelling perspective, it is very difficult to capture all these scales of variability within one relatively simple and sensible model of the forms (6) and (7). Furthermore, the modelling strategy chosen needs to be sensitive to the desired objective of forecasting one week ahead. These considerations led us to divide the analysis into two parts: a macro-scale analysis to identify and forecast the slowly varying daily level $L_t$ and a micro-scale analysis to identify and forecast the level corrected component $R_t$.

For the macro analysis the half-hourly data were reduced to daily data by taking daily averages. From equation (6) this yields

$$\bar{Y}_n = \bar{L}_n + \bar{R}_n$$

where $\bar{L}_n$ is the average of $L_t$ over day $n$ and the additive noise $\bar{R}_n$ is obtained from averaging $D_t$ and $\varepsilon_t$ over day $n$. Since $L_t$ is defined to be slowly varying and smooth it should be well approximated by $\bar{L}_n$ over day $n$. Thus, if $\bar{R}_n$ is approximately white, forecasts of $\bar{Y}_n$ should provide reasonable forecasts of $L_t$. If, in addition, $\bar{R}_n$ is small relative to $\bar{L}_n$ then approximate and conservative forecast intervals can be obtained for $L_t$ based on forecasting the time series of daily averages $\bar{Y}_n$.

Similar considerations apply to the micro analysis which was based on the half-hourly data corrected for daily average level. Thus forecasts of $R_t$ were obtained from the time series

$$Y_t - \bar{Y}_n = R_t + \theta_t$$

where the noise $\theta_t$ is $L_t - \bar{L}_n - \bar{R}_n$. If $\theta_t$ is approximately white and small relative to $R_t$ then, once again, forecasts of $Y_t - \bar{Y}_n$ should provide reasonable forecasts of $R_t$. Moreover, approximate and conservative forecast intervals can be obtained for $R_t$ based on forecasting the time series $Y_t - \bar{Y}_n$.

Finally, forecasts of $Y_t$ were obtained by summing the macro and micro forecasts of $L_t$ and $R_t$, respectively. Note that this implicitly assumes that $\bar{Y}_n$ and $Y_t - \bar{Y}_n$ are mutually uncorrelated which, as before, is approximately true if $\bar{R}_n$ and $\theta_t$ are approximately white noise and small.

The assumptions made above follow from the requirement that $L_t$ should be smooth, at least on a daily time scale, and that the daily cycle $D_t$ evolves sufficiently slowly so that $D_t$ sums approximately to zero over any day. Furthermore, daily averages of the noise component $\varepsilon_t$ should be small relative to both $L_t$ and $R_t$. In our case these assumptions involve approximations which appear to be justified from a data analysis perspective.

In essence, $\bar{Y}_n$ and $Y_t - \bar{Y}_n$ are being used as surrogates for $L_t$ and $R_t$, respectively. Furthermore, the decomposition (6) was selected with a view to one-week-ahead forecasting. Other decompositions are also possible. To address these issues properly, a more flexible model framework is needed which can be tailored to a variety of forecasting objectives, and which can cope with the diversity of time scales involved, while retaining a relatively simple conceptual structure. This remains an important area for further research.
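The split into a macro series of daily averages and a micro series of level-corrected half-hourly values can be sketched as follows. The function name and the toy data are illustrative assumptions; the input is taken to be the log-transformed half-hourly series with whole days only.

```python
def split_macro_micro(y, per_day=48):
    """Return (daily averages, half-hourly values minus their daily average)."""
    if len(y) % per_day:
        raise ValueError("series must contain whole days")
    daily_means, residuals = [], []
    for start in range(0, len(y), per_day):
        day = y[start:start + per_day]
        mean = sum(day) / per_day
        daily_means.append(mean)                  # macro series
        residuals.extend(v - mean for v in day)   # micro series
    return daily_means, residuals


# Two toy days of half-hourly values.
y = [float(t) for t in range(96)]
macro, micro = split_macro_micro(y)
```

By construction the micro series sums to zero within each day, and adding each day's average back to its residuals recovers the original series, which is why forecasts of the two parts can simply be summed.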

Macro forecasting procedure

Averaging over consecutive days in equation (7) yields

$$\bar{Y}_n = \bar{T}_n + \bar{A}_n + \bar{W}_n + N_n$$

where $\bar{T}_n$, $\bar{A}_n$ and $\bar{W}_n$ are averages of $T_t$, $A_t$ and $W_t$, respectively, over day $n$. Here the corresponding daily averages of $C_t$ and $\varepsilon_t$ are subsumed into one autocorrelated error term $N_n$ rather than modelling both components separately.

Initially, we fitted stochastic models to $\bar{T}_n$, $\bar{A}_n$ and $\bar{W}_n$, obtaining maximum likelihood estimates of the variances that were close to zero for all three terms. Hence, given the focus on one-week-ahead forecasting, these components were all treated as deterministic and $\bar{Y}_n$ reformulated as a time-series regression model.

Over the period of the data studied, a linear trend $\bar{T}_n$ sufficed. The annual cycle $\bar{A}_n$ was modelled as a sum of fixed sinusoidal terms plus a Christmas adjustment $X_n$ so that

$$\bar{A}_n = \sum_{j=1}^{a} \left[\alpha_j \cos \lambda_j n + \alpha_j^* \sin \lambda_j n\right] + X_n, \qquad \lambda_j = \frac{2\pi j}{365.25}$$

Based on periodogram tests of detrended weekly averages, we chose $a = 7$ sinusoidal terms. The Christmas adjustment $X_n$ was included to handle the sharp drop at Christmas since a pure sinusoidal model would have led to an unacceptably high value of $a$. The form of the adjustment $X_n$ is given by

$$X_n = \gamma_0 \left(1 - \exp\{\rho (n_2 - n)\}\right) \chi(n_1 < n < n_2) + \gamma_1 \, \chi(n = n_0)$$

where $n_0$ = 24 December, $n_1$ = 25 December and $n_2$ = 17 January.

The weekly cycle $\bar{W}_n$ was modelled as five separate daily levels: one for each of Friday, Saturday, Sunday and Monday, and one for the midweek days combined. Finally, the correlated noise $N_n$ was modelled as a first-order autoregressive (AR) process.

The time-series regression model was fitted using Gaussian maximum likelihood and forecasts constructed. The model involved one autoregressive parameter and 22 regression parameters: two for the trend, 16 for the annual cycle, and four for the weekly cycle. Rather than estimating the shape parameter $\rho$ of the Christmas correction, which requires a nonlinear estimation, it was fixed at 0.1 by trial and error.
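To make the parameter count concrete, the sketch below builds one row of a regression design matrix with 22 columns: two for the linear trend, fourteen for seven annual harmonics, two for the Christmas adjustment and four weekly dummies. Several details are assumptions for illustration only: the midweek days are used as the baseline level, the day index is counted from an arbitrary `origin` date, and the sign inside the exponential follows the transcript as printed. The function names are hypothetical.

```python
import math
from datetime import date


def christmas_terms(d, rho=0.1):
    """Regressors multiplying the two Christmas coefficients on calendar day d."""
    # The recovery window runs from 25 December to the following 17 January.
    start_year = d.year if d.month == 12 else d.year - 1
    n1 = date(start_year, 12, 25)
    n2 = date(start_year + 1, 1, 17)
    ramp = 1.0 - math.exp(rho * (n2 - d).days) if n1 < d < n2 else 0.0
    eve = 1.0 if (d.month, d.day) == (12, 24) else 0.0
    return ramp, eve


def design_row(d, origin, n_harmonics=7):
    """One row of regressors for the daily-average model on calendar day d."""
    n = (d - origin).days
    row = [1.0, float(n)]                          # linear trend
    for j in range(1, n_harmonics + 1):            # annual cycle
        lam = 2.0 * math.pi * j / 365.25
        row += [math.cos(lam * n), math.sin(lam * n)]
    row += list(christmas_terms(d))                # Christmas adjustment
    wd = d.weekday()                               # Monday = 0 ... Sunday = 6
    row += [float(wd == k) for k in (4, 5, 6, 0)]  # Fri, Sat, Sun, Mon dummies
    return row


row = design_row(date(1988, 4, 1), origin=date(1982, 4, 1))
```

Stacking such rows over all days gives a design matrix that could be fitted by generalized least squares with first-order autoregressive errors; the fitting step itself is not shown here.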

Micro forecasting procedure

Forecast intervals for the daily residual $R_t$ were then approximated by forecast intervals for the mean corrected series $Y_t - \bar{Y}_n$. To capture the dynamic behaviour of the daily cycle, the daily shapes were allowed to evolve slowly over the year. Three different dynamic daily shapes were used corresponding to the Saturday, Sunday and weekday cycles. The Friday and Monday daily cycles were obtained by fixed adjustments to the mid-week daily cycle. Also, a fixed adjustment to the shape of all days was required to handle the change to and from daylight savings. Finally, holidays and other 'special' days were modelled by classifying such days as another day of the week (usually a weekend day) as appropriate. For example, Good Friday was classified as a Sunday, and the preceding Thursday as a Friday.

Following Harvey (1989) and Walkington (1990), the dynamic daily cycles were modelled using slowly varying sinusoids as

$$D_{k,t} = \sum_{j=1}^{d_k} D_{k,j}(t)$$

where $k \in \{W, Sa, Su\}$ indexes the cycle shape (weekday, Saturday, or Sunday) and $d_k$ denotes the number of frequencies $\phi_j = 2\pi j/48$ required. The $D_{k,j}(t)$ satisfy the state space transition equation

$$\begin{bmatrix} D_{k,j}(t) \\ D^*_{k,j}(t) \end{bmatrix} = \begin{bmatrix} \cos \phi_j & \sin \phi_j \\ -\sin \phi_j & \cos \phi_j \end{bmatrix} \begin{bmatrix} D_{k,j}(t-1) \\ D^*_{k,j}(t-1) \end{bmatrix} + \begin{bmatrix} \delta_{k,j}(t) \\ \delta^*_{k,j}(t) \end{bmatrix}$$

where $\delta_{k,j}(t)$ and $\delta^*_{k,j}(t)$ are mutually independent Gaussian white-noise processes with common variance $\sigma^2_\delta(k, j)$. Here each daily cycle $D_{k,t}$ was constrained to evolve only over the days for which it applies. In keeping with other applications of this model and to minimize computational cost, we chose to constrain the $\sigma^2_\delta(k, j)$ to be the same within each daily cycle. Moreover, the number of sinusoidal terms used was $d_k = 15$ for all $k$. The latter was determined by identifying significant ordinates in the periodogram of the half-hourly data $Y_t - \bar{Y}_n$ and by examining the half-hourly residuals. For a general discussion on the nature and properties of these evolutionary cyclic models see Harvey (1989).

The Monday and Friday adjustments to the weekday cycle and the daylight savings adjustment were represented by sums of fixed sinusoids

$$S_{k,t} = \sum_{j=1}^{s_k} \left(\beta_{k,j} \cos \phi_j t + \beta^*_{k,j} \sin \phi_j t\right)$$

where $k \in \{Mo, Fr, DLS\}$ indexes the cycle shape (Monday, Friday, or daylight savings). For modelling purposes, daylight savings was considered to start and end at 10 am Sunday. The number of sinusoidal terms used was $s_{Mo} = s_{Fr} = 5$ and $s_{DLS} = 10$; these were determined by analysis of the periodogram and the residuals.
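One harmonic of an evolving daily cycle can be simulated with a short sketch. It assumes the rotation convention used in Harvey's stochastic cycle (cosine on the diagonal, sine above it and minus sine below it), which the garbled transcript does not settle unambiguously; the function name, the noise level `sigma` and the starting state are illustrative.

```python
import math
import random


def step_harmonic(state, j, sigma=0.0, rng=random, period=48):
    """Advance one harmonic (value, conjugate) by one half-hour."""
    d, d_star = state
    phi = 2.0 * math.pi * j / period
    c, s = math.cos(phi), math.sin(phi)
    noise = rng.gauss(0.0, sigma) if sigma > 0 else 0.0
    noise_star = rng.gauss(0.0, sigma) if sigma > 0 else 0.0
    return c * d + s * d_star + noise, -s * d + c * d_star + noise_star


# With no noise the first harmonic repeats exactly every 48 half-hours.
state = (1.0, 0.0)
path = []
for _ in range(48):
    state = step_harmonic(state, j=1)
    path.append(state[0])
```

With `sigma` set to zero the harmonic is a fixed sinusoid; a small positive value lets the amplitude and phase drift slowly, which is the sense in which the daily shape evolves over the year. A full daily cycle would sum fifteen such harmonics.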

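The overall model for the daily residuals was

$$Y_t - \bar{Y}_n = D_t + \varepsilon_t$$

where $\varepsilon_t$ denotes uncorrelated white noise and the daily cycle $D_t$ is given by

$$D_t = S_{DLS,t} \, \chi(t \in DLS) + \begin{cases} D_{W,t} + S_{Mo,t} & t \in \text{Monday} \\ D_{W,t} & t \in [\text{Tue, Wed, Thu}] \\ D_{W,t} + S_{Fr,t} & t \in \text{Friday} \\ D_{Sa,t} & t \in \text{Saturday} \\ D_{Su,t} & t \in \text{Sunday} \end{cases}$$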

The model was then cast in state space form and fitted by Gaussian maximum likelihood computed using the Kalman filter (see Harvey, 1989). This involved concentrating the 40 regression parameters and one variance parameter out of the likelihood, and then estimating the remaining three variance parameters by nonlinear optimization.

Combining the forecasts

Forecasts of $Y_t$ were obtained by summing the individual forecasts of $\bar{Y}_n$ and $Y_t - \bar{Y}_n$. The forecast of the load curve for the transformed data for the week commencing Friday 14 October 1988 is given in Figure 4.

Performance of the forecasting models

The residuals from the fitted macro model were approximately Gaussian, with slightly fatter tails (consistent with a Student's t-distribution with roughly 12 degrees of freedom), and had a root mean square (RMS) proportionate error of approximately 1.8%. A small amount of autocorrelation remained in the residuals at lag 7. This reflected the inadequacy of the deterministic model used for the weekly cycle. In subsequent work, a stochastic evolutionary model for the weekly cycle has been incorporated, leading to a slight improvement in the daily level forecasts.

Figure 4. Forecasts of the load curve for the week beginning Friday 14 October 1988 (dotted line), and the actual load curve for this week (solid line). Horizontal axis: half-hours since the start of the week

Similarly, the RMS proportionate error for the micro model was approximately 2.1%, which is slightly larger than the RMS error for the macro model. The micro model performed worst when there was a significant change in temperature from previous days. The variability of the residuals was greatest during peak usage times. The residuals around 11 pm were abnormally large due to the jump in consumption at this time.

In Figure 4 the combined forecasts are compared to actual data which were withheld from the model fitting and identification stage. Over this week the forecasts have an RMS proportionate error of approximately 2.2%, which is typical for the procedure except when there is a dramatic change in temperature during the forecast week. Preliminary evaluations therefore indicate that the load curve forecasting procedure consistently produces good forecasts over different periods of data, and is good enough to be incorporated into a production system.
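The text does not spell out how the RMS proportionate error is computed. One plausible reading, offered here as an assumption, is the root mean square of the relative errors (forecast minus actual, divided by actual) on the original megawatt scale. The function name and the numbers in the usage line are hypothetical.

```python
import math


def rms_proportionate_error(forecast, actual):
    """Root mean square of (forecast - actual) / actual."""
    if len(forecast) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [((f - a) / a) ** 2 for f, a in zip(forecast, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Two illustrative half-hours, each forecast off by 2%.
error = rms_proportionate_error([3060.0, 2940.0], [3000.0, 3000.0])
```

For a model fitted on the log scale, the RMS of the log-scale residuals gives nearly the same number when errors are a few per cent, since log(1 + e) is close to e for small e.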

Forecasting the load-duration curve

The forecast load-duration curve and pointwise confidence limits were obtained using the approach described above. Since the forecast distributions for $\bar{Y}_n$ and $Y_t - \bar{Y}_n$ are Gaussian, the mean $\mu_t$ and variance $\sigma_t^2$ are easily obtained from the regression model and the standard Kalman filter prediction formulae. Thus the predictive distributions of the loads $Y_t$ were readily obtained by combining the predictive distributions for $\bar{Y}_n$ and $Y_t - \bar{Y}_n$ under the assumption that the latter are approximately independent. Note, however, that in order to compute the variance $V_n(x)$ of the forecast load-duration curve it was necessary to compute the correlations $\rho_{st} = \operatorname{corr}(Y_s, Y_t \mid \text{data})$. This was readily done through simple extensions to the Kalman filter.

Figure 5. Forecast of the load-duration curve (dotted line) with 95% pointwise confidence interval (dashed lines) for the week beginning Friday 14 October 1988. The actual load-duration curve (solid line) falls mostly within the estimated confidence interval. Horizontal axis: load

Forecasts of the load-duration curve for the week commencing 14 October 1988 together with pointwise 95% confidence intervals are given in Figure 5. The results are reasonable with the actual load-duration curve falling mostly within the estimated confidence interval. The 95% confidence limits look to be somewhat conservative, although they can, as in any multistep-ahead forecasting application, be tightened by decreasing the confidence level. Such bounds should prove to be of value in practice, especially when convolved with stochastic or uncertain supply. The assessment of uncertainty in load-duration curve forecasts is also important for planning purposes.
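The combination step and the pointwise limits can be sketched as follows. The sketch assumes the macro and micro predictive distributions are independent Gaussians on the log scale, as the text describes, so means and variances add. The interval uses a normal approximation (forecast plus or minus a normal quantile times the square root of the prediction variance, clipped to the unit interval), which is an assumption since the paper does not state how its limits were formed. All names and numbers are illustrative.

```python
import math
from statistics import NormalDist


def combine(mu_macro, var_macro, mu_micro, var_micro):
    """Predictive mean and variance of the log-load for one half-hour."""
    return mu_macro + mu_micro, var_macro + var_micro


def ldc_point_forecast(x, mus, variances):
    """Forecast proportion of half-hours with load >= x."""
    std = NormalDist()
    g = math.log(x)
    probs = [1.0 - std.cdf((g - m) / math.sqrt(v))
             for m, v in zip(mus, variances)]
    return sum(probs) / len(probs)


def pointwise_interval(p_hat, v, level=0.95):
    """Normal-approximation limits for the curve at one load level."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * math.sqrt(v)
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


mu, var = combine(5.0, 0.01, 0.2, 0.03)
p_hat = ldc_point_forecast(math.exp(5.2), [mu], [var])
lower, upper = pointwise_interval(p_hat, 0.01)
```

In practice the lists passed to `ldc_point_forecast` would hold the 336 half-hourly predictive means and variances for the forecast week, and the variance passed to `pointwise_interval` would come from the double sum over the week described earlier.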

The procedure proposed here was compared to that routinely used by the Electricity Corporation of New Zealand Ltd. The latter procedure is a variant of a commonly used option within EGEAS (Electric Generation Expansion Analysis System) described in EPRI (1982). In particular, it computes the first eight moments of the distributions of half-hourly loads over past weeks and then predicts these moments for the week ahead. The load-duration curve is then reconstructed from these predicted moments using an eight-moment Gram-Charlier expansion. The relative performance of the two procedures is given in Figure 6 for the week beginning Monday 17 October. It can be seen that the eight-moment Gram-Charlier expansion does not capture the variability of the actual load-duration curve as well as the procedure proposed here.

Figure 6. Forecast of the load-duration curve (dotted line) for the week beginning Monday 17 October 1988 together with the actual load-duration curve (solid line). An eight-moment Gram-Charlier load-duration curve (dashed line) is plotted for comparison. Horizontal axis: Load.

Measures of the goodness of fit of these forecasts to the actual load-duration curve can be obtained using Mallows metrics. If F(x) and G(x) are continuous distribution functions with well defined inverses, then the Mallows metrics define the distance between F(x) and G(x) as

$$M_p(F, G) = \left( \int_0^1 \left| F^{-1}(u) - G^{-1}(u) \right|^p \, du \right)^{1/p}$$

for p > 0. Note that p = 1 and p = 2 give the mean absolute deviation and the root-mean-square respectively of the differences $F^{-1}(u) - G^{-1}(u)$. The results of comparing the actual load-duration curve with its two forecasts using $M_p(F, G)$ for p = 1 and p = 2 are given in Figure 7. On the basis of these indicative figures, it can be seen that the proposed procedure is a marked improvement over the existing Gram-Charlier procedure with forecast accuracy improved by approximately 30%.

                        MAD    RMS
Proposed method          39     52
Gram-Charlier method     59     71

Figure 7. Measures of the goodness of fit in megawatts of load-duration curve forecasts to the actual load-duration curve for the proposed method and the eight-moment Gram-Charlier expansion. Mallows metrics $M_p$ are used with p = 1 and p = 2. These correspond to the mean absolute deviation (MAD) and the root-mean-square (RMS), respectively, of the differences between the inverses of the load-duration curve forecasts and the actual load-duration curve.
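The Mallows metric described here can be sketched for the simple case of two empirical distributions. The sketch assumes both distributions are given as samples of equal size, in which case the integral over u reduces exactly to an average over the paired order statistics. In the paper the forecast is an analytic curve rather than a sample, so this is an illustration of the metric only; the function name `mallows_distance` and the sample values are hypothetical.

```python
def mallows_distance(sample_f, sample_g, p=1.0):
    """Mallows distance M_p between two equal-size empirical distributions.

    Sorting each sample gives its empirical quantile function, so the
    integral of |F^-1(u) - G^-1(u)|^p over (0, 1) becomes a mean over
    the paired sorted values.
    """
    if len(sample_f) != len(sample_g) or not sample_f:
        raise ValueError("samples must be non-empty and of equal size")
    if p <= 0:
        raise ValueError("p must be positive")
    qf, qg = sorted(sample_f), sorted(sample_g)
    mean_pth = sum(abs(a - b) ** p for a, b in zip(qf, qg)) / len(qf)
    return mean_pth ** (1.0 / p)


# Hypothetical loads in megawatts, for illustration only.
actual = [250.0, 310.0, 400.0, 280.0]
forecast = [260.0, 300.0, 380.0, 290.0]
mad = mallows_distance(actual, forecast, p=1)  # mean absolute deviation
rms = mallows_distance(actual, forecast, p=2)  # root-mean-square
```

Since the metric compares quantile functions, its value is in the same units as the loads, which is why the entries of Figure 7 are reported in megawatts.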


CONCLUSIONS

The load-duration curve forecasting strategy advocated in this report appears to have considerable merit. It is largely independent of the forecasting models adopted and can produce smooth analytic load-duration curve forecasts which accurately mirror the actual load-duration curve. The analytic load-duration curves adopted represent the forecast load-duration curve by a weighted sum of transformed normal distribution functions and, as such, are eminently amenable to convolution with other random variables such as supply. Moreover, pointwise confidence limits for the load-duration curves are readily established. Finally, the forecast load-duration curve is closely linked to the forecast load curve in a natural and intuitive manner.

However, the procedure is clearly dependent on sufficiently accurate forecasts of the load curve. In particular, care must be taken in practice to properly account for holiday, weekend and daylight-saving effects, and to carefully select a data transformation that best induces (approximate) normality. In the application presented here, the forecasting model is accurate enough to be incorporated into production usage.

In co-operation with the Electricity Corporation of New Zealand Ltd, we are continuing to improve the load curve forecasting model. In particular, we hope to account for the effects of weather in the micro forecasting procedure. In addition, we are examining the basic assumptions of the macro and micro models, with the long-term aim of replacing them with a single unified model.

Finally, note that the procedure provides pointwise rather than simultaneous confidence limits for the load-duration curve. The latter are considerably more difficult to obtain, although the work of Ravishanker et al. (1991) may prove helpful in this regard.

ACKNOWLEDGEMENTS

This research was financially supported by the Electricity Corporation of New Zealand Ltd. We are grateful to the Corporation for allowing us to use their data in this paper. In particular, we would like to expressly thank Dr Jonathan Lermit of that organization for presenting us with the problem, for providing the load-duration curve forecast given in Figure 6, and for providing us with much valuable guidance.

REFERENCES

Baleriaux, H., Jamoulle, E. and de Guertechin, Fr. L., ‘Simulation de l’exploitation d’un parc de machines thermiques de production d’électricité couplé à des stations de pompage’, Revue E, Soc. Belge des Electriciens, 5 (1967), 3-24.

Booth, R. R., ‘Power system simulation model based on probability analysis’, IEEE Trans. Power Apparatus and Systems, PAS-91 (1972), 70-71.

Bruce, A. G., Jurke, S. R. and Thomson, P. J., ‘Forecasting load duration curves’, report prepared for the Electricity Corporation of New Zealand Ltd (1990).

Bruce, A. G., Jurke, S. R. and Thomson, P. J., ‘Forecasting load duration curves: Stage 2’, report prepared for the Electricity Corporation of New Zealand Ltd (1991).

Bunn, D. (ed.), ‘Approaches to electric load forecasting’, special section in Journal of Forecasting, 6 (1987), 91-156.

Cleveland, W. S. and Devlin, S. J., ‘Calendar effects in monthly time series: modelling and adjustment’, Journal of the American Statistical Association, 77 (1982), 520-528.

Cleveland, W. S., Dunn, D. M. and Terpenning, I. J., ‘SABL-a resistant seasonal adjustment procedure with graphical methods for interpretation and diagnosis’, in Zellner, A. (ed.), Seasonal Analysis of Economic Time Series, 201-231, Washington, DC: US Department of Commerce, Bureau of the Census, 1978.

Engle, R. F., Mustafa, C. and Rice, J., ‘Modelling peak electricity demand’, Journal of Forecasting, 11 (1992), 241-251.

EPRI, Electric Generation Expansion Analysis System, EL-2561, final report of Project 1529-1 prepared by Massachusetts Institute of Technology, Cambridge, Massachusetts, 1982.

Gates, D. J., ‘On the optimal composition of electricity grids with unreliable units: solvable models’, Adv. Appl. Prob., 17 (1985), 367-385.

Harvey, A. C., Forecasting, Structural Time Series Models and the Kalman Filter, Cambridge University Press, 1989.

Harvey, A. C. and Koopman, S. J., ‘Forecasting hourly electricity demand using time-varying splines’, Journal of the American Statistical Association, 88 (1993), 1228-1236.

Johnson, N. L. and Kotz, S., Distributions in Statistics: Continuous Multivariate Distributions, New York: John Wiley, 1972.

Ravishanker, N., Shiao-Yen Wu, L. and Glaz, J., ‘Multiple prediction intervals for time series: comparison of simultaneous and marginal intervals’, Journal of Forecasting, 10 (1991), 445-463.

Walkington, M. T., Forecasting electricity consumption with structural time series models, MSc thesis, Victoria University of Wellington, New Zealand, 1990.

Authors’ biographies:

Andrew Bruce completed his A.B. degree in statistics at Princeton University in 1980 and his Ph.D. in statistics at the University of Washington, Seattle, in 1988. He is currently a research scientist at the StatSci division of MathSoft, Inc. His main research interests are in the areas of wavelets, time series analysis, seasonal adjustment, signal processing, and scientific computing.

Simon Jurke completed his B.Sc.(Hons) degree in 1989 and his M.Sc. degree in 1991, both in statistics and operations research at Victoria University of Wellington, New Zealand. He is currently an Operations Research Analyst for CORE Management Systems Ltd, Wellington, New Zealand. His main interests are in simulation, forecasting and statistical programming.

Peter Thomson completed his B.Sc.(Hons) degree in mathematics at Otago University, New Zealand, in 1968 and his Ph.D. in statistics at the Australian National University, Canberra, in 1972. He is currently a Reader in Statistics at the Institute of Statistics and Operations Research, Victoria University of Wellington, New Zealand. His main research interests are in the general area of time series including forecasting, seasonal adjustment, signal processing, delay estimation and irregular sampling. This research typically has an applied focus with application to economic, financial, geophysical, meteorological and oceanographic time series.

Authors’ address: Andrew Bruce, Simon Jurke and Peter Thomson, Institute of Statistics and Operations Research, Victoria University of Wellington, P.O. Box 600, Wellington, New Zealand.