2007 International Conference on Electrical Engineering, Lahore, Pakistan, 11-12 April 2007


Long-Term Forecasting of Internet Traffic for Pakistan Internet Exchange

Mehreen Alam
Lahore University of Management Sciences, Lahore

[email protected]

Abstract - The paper makes future predictions of the Internet traffic demand by applying statistical techniques on data collected between two nodes of Pakistan's Internet backbone over a period of two years. The predictions simplify the task of capacity planning for the network management by helping them determine when and to what extent future provisioning is required in the backbone. This is not easy, as quantified information is missing about the rate of increase and the pattern followed by the Internet traffic demand. We have used various statistical analysis methods on the aggregate Internet demand between two nodes to isolate the long term trend from the noisy short term fluctuations in the overall traffic pattern, ensure its variance is within control limits and finally make a model out of it to make predictions for the future. The reliability of the future predictions given by our proposed model is proved empirically by the fact that the estimates of the future values deviate by only 7% from the actual values observed during that time.

I. INTRODUCTION

The purpose of this paper is to do capacity planning of Internet traffic demand by applying statistical techniques on data collected between two nodes of Pakistan's Internet backbone over a period of two years. We have used various statistical analysis methods on the aggregate Internet demand between two nodes to isolate the long term trend from the noisy short term fluctuations in the overall traffic pattern, ensure its variance is within control limits and finally make a model out of it to make predictions for the future.

The research methodology incorporates wavelet multi-resolution analysis, analysis of variance and linear time series models. Wavelet multi-resolution analysis smoothes the collected measurements until the overall long-term trend is identified, keeping intact the short term trends. The fluctuations around the trend are further analyzed at multiple time scales. It has been observed that the largest amount of variability in the original signal is due to its fluctuations at the 24 hour time scale, which just confirms the diurnal pattern exhibited by the traffic demand. We show that for network provisioning purposes one needs only to account for the overall long term trend, as the fluctuations at smaller time scales have an insignificant contribution in the Internet traffic pattern across the backbone. We model the inter-Point of Presence (PoP) aggregate demand using two components: the long term trend and its variation. The inter-PoP aggregate demand is mapped to a multiple linear regression model with the above-mentioned two identified components. Using the Analysis of Variance (ANOVA) technique, it is shown that the proposed model captures 98% of the total energy of the original signal and takes care of 91% of its variance. Weekly approximations of those components can be accurately modeled with low-order Auto-Regressive Integrated Moving Average (ARIMA) models. The reliability of the future predictions given by our proposed model is proved empirically by the fact that the estimates of the future values deviate by only 7% from the actual values observed during that time. The method is practical in terms of computation and storage, but is robust only as long as the technology of the Internet service model is unchanged.

The present mode of Internet traffic modeling and forecasting is restricted to two nodes only, which does not give us a consistent global status of the whole backbone. Future work would target the modeling and forecasting for the overall backbone capacity planning issues, which would also help in load balancing, handling the links that are temporarily down, and determining where a new node or a link is required.

II. MEASUREMENT ENVIRONMENT

A. Pakistan Internet Exchange

Pakistan Internet Exchange (PIE) was created in 2000 to cater for the needs of IP/ATM connectivity via a single core data backbone for the whole country. Figure 1 below shows the existing architecture of the PIE network topology interconnecting major cities of Pakistan. All the links between the nodes of Karachi and Lahore (marked in red in Figure 1) are monitored over a period of almost 2 years (Dec 2003 to Dec 2005). New links were added and the old ones were replaced by technologically advanced links. The data, in bits per second, is recorded at each link connecting the PoP Lahore with the PoP Karachi. Only the data exchanged on all links between these two nodes is taken into account.

Fig. 1. Architecture of Pakistan Internet Exchange.


B. Collected Data

Data was collected over a time period of approximately two years, with data missing for a few months of the first year and for a few weeks of the second. Missing data could have been interpolated, but we chose to keep it unmodified to preserve the originality of the data, as it has to undergo many levels of approximation later on. Data was captured and saved in graphical format using the MRTG tool. It had to be converted to vector form using customized image processing techniques.

C. MRTG Data

MRTG is an acronym for Multi Router Traffic Grapher, a tool to monitor the traffic load on network links. Normally, MRTG retrieves the statistics data using SNMP, records the configuration values, updates the statistics, saves the statistics in a graphical representation and finally incorporates the graphs in HTML pages. The whole process is repeated typically every five minutes, and one HTML page is generated that features four graphs. Traffic workload on the node is plotted against time in day, week, month and year graphs. The green area shows the Internet traffic input on the node, while the blue line marks the output traffic; we are only concerned with the input traffic and restrict the analysis to the green region for the time being. A sample MRTG graph is shown below.

Fig. 2. Sample MRTG Graph

D. Data Specifications

Data was collected from Dec 03 to Dec 05, with HTML pages generated at a frequency of 5 minutes for three links (namely: lhr-kh-2, lhr-kh-10 and lhr-kh-30) between the nodes of Lahore and Karachi. However, data was not available for the whole duration, either when a link was down temporarily or when the stored data was lost. Data for the link 'lhr-kh-2' is missing for 29 weeks of the year 2004. Data for both the links 'lhr-kh-2' and 'lhr-kh-10' has some days when the whole data was lost together or the links were down separately.

The reason to keep the data in weekly form was to keep intact the trend exhibited within a week. For example, usage at weekends differs from that on weekdays; the former usually has a lesser load. Such variations get smoothed out to an extent but still contribute in forming a model for the Internet traffic demand. Once a satisfactory model has been made, we just deal with one data value for each week. The predicted data is also indicated as one value for each week.

Missing data could have been interpolated, but we chose to omit those data points altogether. Firstly, almost half of the first year's data was missing; interpolating it would mean having a substantial contribution of dummy data in the analysis, leading to inaccurate results. Secondly, it was wiser to keep the originality of the data, as it already has to undergo many levels of approximation in aggregation, multi-time-scale resolution and forecasting using time series analysis.

A week's data is included in the analysis only if the complete data is available for the week from Monday to Sunday, both inclusive. The order of the days is also important; for example, data for a week running Tuesday to Monday is not applicable, as it does not capture the weekly trend followed by weekdays and weekends.
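As an illustration of this completeness rule, the following minimal pandas sketch keeps only fully populated Monday-to-Sunday weeks; it assumes the extracted samples sit in a series indexed by timestamps at the 2-hour granularity, and the names used are illustrative rather than taken from the paper.

    import pandas as pd

    def complete_weeks(samples: pd.Series) -> pd.Series:
        """Keep only Monday-to-Sunday weeks in which every 2-hour sample is present.

        `samples` is assumed to be indexed by a DatetimeIndex at 2-hour spacing,
        so a complete week holds 7 * 12 = 84 samples.
        """
        week = samples.index.to_period("W-SUN")           # weeks running Monday..Sunday
        counts = samples.groupby(week).transform("size")  # samples available per week
        return samples[counts == 7 * 24 // 2]

    # Toy check: two complete weeks starting on a Monday are kept in full.
    idx = pd.date_range("2004-01-05", periods=168, freq="2h")
    demand = pd.Series(range(len(idx)), index=idx)
    print(len(complete_weeks(demand)))                    # 168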

Keeping the target of looking at the long term, it was imperative to select a graph of appropriate time granularity amongst the graphs MRTG gives for each day, week, month and year. The time granularity of each graph is 5 minutes, 30 minutes, 2 hours and 1 day respectively. Out of all of these, the time scale of 2 hours was chosen, as it is refined enough to safeguard the daily variation and at the same time blurred enough to get rid of the high-frequency components. Another, technique-oriented reason is elaborated under the topic of multi-resolution time analysis. Last but not least, the data does not suffer from possible inaccuracies in the SNMP measurements, since such events are smoothed out by the averaging operation (the 2-hour data is an averaged value over 5-minute intervals).

Data was available in a visual representation stored in the lossless compressed bitmap image format PNG (Portable Network Graphics). Digital image processing techniques had to be applied on the individual image files to extract the data into vectors.

E. Data Transformation

For each link between the two nodes (Lahore and Karachi), the demand is aggregated to form one vector. The intricacy involved in this process is to map the up and down spans of each link onto the corresponding time line, keeping in mind that the data for the chosen week is complete and belongs to the corresponding week number of the chosen month. A quick check to validate the correctness of the algorithm was to ensure that the length of the vector is a multiple of seven.

The final transformation done on the data before it is used for modeling is aggregation. Groups of three points are averaged to one point, which changes the granularity from 2 hours to 6 hours. The reason for applying this aggregation becomes clear when the data is smoothed out in wavelet multi-resolution analysis (MRA). The wavelet MRA looks into the properties of the signal at time scales 2^j times coarser than the finest time scale, and the collected measurements exhibit strong periodicities at the cycles of 12 and 24 hours. We would have to go to the granularity of 1.5 hours as the finest time scale, or to 6 hours. The third and the fourth resolutions of the former give us the granularity at 12 and 24 hours,

3rd (2^3 * 1.5 = 12) and 4th (2^4 * 1.5 = 24)   (1)

while for 6 hours as the finest time scale it takes just the first and the second resolutions to achieve the objective,


1st (2^1 * 6 = 12) and 2nd (2^2 * 6 = 24)   (2)

Both are ways of smoothing the data out. Using the 6-hour aggregated data, the first two MRA levels are done in the form of conventional averaging when compared to the 1.5-hour data. Analyzing the data at the smallest time scales of 1.5 hours or 30 minutes is kept for further work.
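The three-point averaging that takes the 2-hour series to the 6-hour granularity can be written in a few lines of numpy; this is a sketch with illustrative names, not code from the paper.

    import numpy as np

    def aggregate_to_6h(x_2h: np.ndarray) -> np.ndarray:
        """Average non-overlapping groups of three 2-hour samples into one 6-hour sample."""
        n = len(x_2h) - (len(x_2h) % 3)          # drop an incomplete trailing group, if any
        return x_2h[:n].reshape(-1, 3).mean(axis=1)

    # One day of 2-hour samples (12 points) becomes four 6-hour points.
    day = np.arange(12, dtype=float)
    print(aggregate_to_6h(day))                  # [ 1.  4.  7. 10.]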

III. STATISTICAL ANALYSIS

Thorough observation of the collected data shows that the data has multi-scale variability, strong periodicity and is non-stationary in nature. Periodicity with such attributes helps in coming up with future forecasts for the next cycle of the same period. For instance, periodicities at the weekly cycle imply that the time series behavior from one week to the next can be predicted.

The data obtained has abrupt spikes and ditches (Figure 3). The former could be caused by a surge of traffic within a specific time resolution due to the transfer of load from one link to the other, routing changes, or simply denial of service attacks. A ditch simply implies that the link is down for that interval. For long term trend analysis, these abrupt instantaneous changes are outliers in the data. Therefore, we smooth out these outliers by aggregation and wavelets.

Fig. 3. Aggregated Internet Traffic Demand over the whole year

Wavelet Multi-Resolution Analysis: Wavelet MRA represents a signal using a wavelet function ψ(t) (also called the mother wavelet) and a scaling function φ(t) (also called the father wavelet) in the time domain. At each time scale, the signal is represented through a series of wavelets ψjk(t) and scaling functions φjk(t), where j is the resolution level and k is the translation index. The scaling and wavelet functions are obtained by dilating and translating the father scaling function φ(t), φjk(t) = 2^(-j/2) φ(2^(-j) t - k), and the mother wavelet function ψ(t), ψjk(t) = 2^(-j/2) ψ(2^(-j) t - k). The approximation is represented by a series of (scaling) coefficients ajk and the detail by a series of (wavelet) coefficients djk.
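To make the dilation and translation above concrete, a small numerical sketch of the Haar mother wavelet and its dyadic family follows; it only illustrates the formula ψjk(t) = 2^(-j/2) ψ(2^(-j) t - k) and is not code used in the paper.

    import numpy as np

    def haar_psi(t: np.ndarray) -> np.ndarray:
        """Haar mother wavelet: +1 on [0, 0.5), -1 on [0.5, 1), 0 elsewhere."""
        return np.where((t >= 0) & (t < 0.5), 1.0,
                        np.where((t >= 0.5) & (t < 1.0), -1.0, 0.0))

    def psi_jk(t: np.ndarray, j: int, k: int) -> np.ndarray:
        """Dilated and translated wavelet psi_{j,k}(t) = 2^{-j/2} * psi(2^{-j} t - k)."""
        return 2 ** (-j / 2) * haar_psi(2.0 ** (-j) * t - k)

    t = np.linspace(0, 4, 9)
    print(psi_jk(t, j=1, k=0))   # support stretched to [0, 2), amplitude scaled by 1/sqrt(2)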

TABLE I
RESULTANT TIME SCALE FOR TIME GRANULARITY OF 2 HOURS AND 6 HOURS

Decomposition Level | 2 hour granularity | 6 hour granularity
1st                 | 2^1 * 2 = 4        | 2^1 * 6 = 12
2nd                 | 2^2 * 2 = 8        | 2^2 * 6 = 24
3rd                 | 2^3 * 2 = 16       | 2^3 * 6 = 48
4th                 | 2^4 * 2 = 32       | 2^4 * 6 = 96
5th                 | 2^5 * 2 = 64       | -
6th                 | 2^6 * 2 = 128      | -

The reason for aggregating the data points into a resolution of 6 hours makes sense now. Wavelet MRA is operated at time scales 2^j times coarser than the finest time scale, where j is the resolution index. As evident from Table I, decomposition at the two hour granularity would deprive us of inspecting the periodicity at 12 and 24 hours, while the granularity of 6 hours rightly captures them.

For the application of wavelet MRA we have used the Haar wavelet, which is a special form of the Daubechies wavelets and has a single vanishing moment. Both these attributes of the wavelet are pertinent in the analysis of signals that have self-similarity and require easy encoding of the information data.

In Figures 5 and 6, we show the approximation and detail up to 4 levels of the original Internet traffic at each time scale. The resolution levels are for 12, 24, 48 and 96 hours, the last one at the coarsest level. We did not add another level of granularity, as it would take the window of observation over a week; 96 hours makes up only 4 days, while getting coarser would carry over the effects from the 12-hour and 24-hour scales. These effects have been plotted in the detail signals of the decompositions.

Fig. 4. Original Aggregated Signal
Fig. 5. Approximation Signals
Fig. 6. Detailed Signals


To prove the sufficiency of the approximation signal with respect to the decompositions done, we calculate the percentage of the energy retained by the final approximation signal over the total energy of the signal. The percentage energy of each of the detail signals is also calculated. It is evident that the overall trend has a huge share of almost 96% even after the decomposition. Amongst the detail signals, d2 has the maximum energy share, which corresponds to the fluctuations across 24 hours. Combining the approximation a4 and the detail signal d2 with equal weightage captures the maximum energy, 97.95% (95.8943 + 2.0555), of the original aggregated signal.
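This energy bookkeeping can be reproduced with a few lines of PyWavelets; because the Haar wavelet is orthogonal, the energy of each coefficient band equals the energy of the corresponding signal component. The array x6 below stands in for the 6-hour aggregated demand and is an assumption for illustration, not the measured data.

    import numpy as np
    import pywt

    def energy_shares(x6: np.ndarray, level: int = 4) -> dict:
        """4-level Haar decomposition of the 6-hour series and per-band energy share (%)."""
        coeffs = pywt.wavedec(x6, "haar", level=level)        # [a4, d4, d3, d2, d1]
        names = ["a4"] + [f"d{level - i}" for i in range(level)]
        energies = np.array([np.sum(c ** 2) for c in coeffs])
        return dict(zip(names, 100 * energies / energies.sum()))

    # Toy series: a slow trend plus a 24-hour (4-sample) oscillation over eight weeks.
    t = np.arange(4 * 7 * 8)
    x6 = 50 + 0.05 * t + 5 * np.sin(2 * np.pi * t / 4)
    print(energy_shares(x6))     # the trend dominates a4; the oscillation sits in the fine details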

ANOVA and Regression Analysis: For model formulation, the combination of the approximation and the 24-hour time scale detail signal is used, as it maximizes the energy of the signal while keeping the linear multiple regression model simple. We have used the regression technique in coming up with a linear model and evaluated its degree of correctness using the analysis of variance (ANOVA) technique. Both ANOVA and the regression analysis study the impact of independent variables on the response variable, i.e. of the variables (approximation and 2nd time scale signal) on the original (dependent) signal. But while ANOVA seeks to define the scope of the variables that will be included in an experiment, the regression analysis determines the coefficients for each variable. The variances of the detail signals d1(t), d3(t) and d4(t) contribute less than 5% in the original, and so we choose d2 and a4 as the final variables for the reduced model,

x(t) = β1 * a4(t) + β2 * d2(t) + e(t)   (3)

Next, we have used the least squares method to evaluate the values of the coefficients, which come out to be 0.907 for the approximation signal and 0.298 for d2(t). Simplifying the model, we take the coefficient for the approximation signal to be 1 and 0.3 for d2(t). The goodness of the regression (coefficient of determination) is quite significant (R^2 = 0.912), which means that the model accounts for a large fraction of the variability in the original signal. We also added the d1(t) detail signal to see how this affects the R^2 value; it does increase to 0.955 and the residual sum of squares gets halved, but the complexity added by the additional variable outweighs the benefit of a more accurate estimate. So we restrict ourselves to a4(t) and d2(t) only.
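The least-squares step behind (3) amounts to an ordinary regression without intercept; a minimal numpy sketch is given below, assuming a4 and d2 have already been reconstructed on the same time axis as x (names are illustrative, and the paper used its own tooling).

    import numpy as np

    def fit_reduced_model(x: np.ndarray, a4: np.ndarray, d2: np.ndarray):
        """Least-squares estimates of beta1, beta2 in x(t) = beta1*a4(t) + beta2*d2(t) + e(t),
        together with the coefficient of determination R^2."""
        X = np.column_stack([a4, d2])
        beta, *_ = np.linalg.lstsq(X, x, rcond=None)
        resid = x - X @ beta
        r2 = 1 - np.sum(resid ** 2) / np.sum((x - x.mean()) ** 2)
        return beta, r2

    # Synthetic check: a trend plus 0.3 times a diurnal component is recovered as ~(1.0, 0.3).
    rng = np.random.default_rng(0)
    a4 = np.linspace(50, 80, 200)
    d2 = np.sin(2 * np.pi * np.arange(200) / 4)
    x = a4 + 0.3 * d2 + rng.normal(scale=0.5, size=200)
    print(fit_reduced_model(x, a4, d2))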

Results and Deductions: Decomposition of the original Internet traffic reveals the existence of a long term trend in the traffic. The most influential period that affects the input traffic is the diurnal pattern, followed by the periodicity exhibited at the 12-hour interval. The fluctuations around this long term trend are mostly due to the significant changes in the traffic bandwidth at the time scale of 24 hours. The combined effect of the approximation signal and the diurnal variation accounts for 98% of the total energy in the original signal, which is a good foundation on which to base our future predictions of the traffic demand.

Capacity planning and finding long term trends require that the data is looked at from a broader window, i.e. in terms of weeks, rather than relying on the variations of 24 hours over a week. For this, we find the mean of d2(t) (24 hr) for each week and also find its standard deviation. This smoothes out the diurnal periodicity present in the data and leads to a less complex ARIMA model (discussed in the section on time series analysis). The mean and the variance of the daily data traffic represent the fluctuations of the traffic around the long term trend from day to day within each particular week. Therefore, we get one value per week for the approximation signal, denoted by l(t), and see the variation in terms of the detail signal, denoted by dt2(t).

The ANOVA technique reinforces the conclusions reached above, as the maximum cause of variation in the original signal is the approximation signal and the detail signal for the diurnal period. We proposed a model which explains approximately 91% of the variance in the original signal. At present we are restricted to the links between two nodes only; when it comes to a generic model for the overall backbone, it is important to keep room for more variation. Keeping this strategy in mind, we propose the following model that fully captures the variations in the Internet traffic demand:

x(t) = l(t) + 1.5 * dt2(t)   (4)

In Figure 7, we show the aggregate traffic demand, the long term trend in the data, and the two curves showing the approximation of the signal as the sum of the long term trend within 1.5 times the average daily standard deviation for one particular week. This model rightly covers all the data points within the deviation of the weekly standard deviation. Short term trends have almost vanished and the long term trend followed by the traffic pattern is becoming apparent. Building on the foundations of this model, we project the trend to come up with forecasts that help in the capacity planning of the backbone. The next section discusses how time series models help us in this projection and later compares the values predicted with the actual observations over the testing period.

Fig. 7. Aggregate traffic demand against Months
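The weekly quantities in (4) are simple summaries over the 28 six-hour samples of each week; the sketch below shows one way to compute l(t), dt2(t) and the provisioning envelope, with illustrative inputs standing in for the decomposed signals.

    import numpy as np

    def weekly_trend_and_band(a4: np.ndarray, d2: np.ndarray, samples_per_week: int = 28):
        """Weekly mean of the approximation, l(t), weekly std of the 24-hour detail, dt2(t),
        and the provisioning envelope x(t) = l(t) + 1.5 * dt2(t)."""
        n = len(a4) - (len(a4) % samples_per_week)
        l_t = a4[:n].reshape(-1, samples_per_week).mean(axis=1)
        dt2_t = d2[:n].reshape(-1, samples_per_week).std(axis=1)
        return l_t, dt2_t, l_t + 1.5 * dt2_t

    # Eight weeks of a stand-in trend and a stand-in daily fluctuation.
    a4 = np.linspace(60, 70, 28 * 8)
    d2 = np.random.default_rng(1).normal(0, 2, 28 * 8)
    l_t, dt2_t, upper = weekly_trend_and_band(a4, d2)
    print(upper.round(1))        # one provisioning value per week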

G. Uni-Variate Time Series Analysis

The first requirement for an adequate model of Internet traffic is that it must be stochastic and not deterministic. There are many factors affecting the amount of traffic on the backbone, most of which cannot be measured or identified. To predict probable future traffic, the best available basis is an analysis of previously observed traffic patterns.


Because we want to examine changes in the traffic over time, the second requirement is that the model must be a time series model. Furthermore, the Internet traffic data is non-stationary, so the model must be of a form that can accept non-stationary data. A time series model that fits these criteria is the autoregressive integrated moving average (ARIMA) process [2].

The ARIMA model is an extension of a set of time series models called autoregressive (AR), moving average (MA) and autoregressive moving average (ARMA). An autoregressive model of order p (denoted by AR(p)) predicts the current value of a time series based on the weighted sum of p previous values of the process plus a random shock a. (A shock is a random drawing from a white noise process with zero mean and finite variance.) A moving average model of order q (denoted by MA(q)) predicts the current value based on a random shock a and weighted values of q previous a's. If these models are combined, the ARMA model of order (p,q) predicts the current value of the time series based on p previous values and q previous shocks. The advantage of the ARMA model is that many stationary time series can be modeled with p and q values of 0, 1, or 2.

Fitting an ARIMA Model to Data: When fitting an ARIMA model to time series data, there are three basic steps which are used iteratively until a successful model is achieved:

1) Model Identification: This is the determination of the likely values of p, d, and q for this set of data. Often there will be several plausible models to be examined.

2) Parameter Estimation: Once a set of possible models has been selected, parameter (coefficient) values are determined for each model.

3) Diagnostic Checking: This involves both checking how well the fitted model conforms to the data and also suggesting how the model should be changed in case of a lack of good fit. Based on the outcome of the diagnostic test, p, d or q may be changed and steps 2 and 3 are repeated.

Once a good fitting ARIMA model has been found by this method, it can be used to make forecasts of the future behavior of the system; a compact sketch of this fit-and-select loop is given below.
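The loop can be sketched with statsmodels (the paper itself used SPSS 11.5); the weekly array below is a synthetic stand-in for the one-value-per-week aggregated demand, and the candidate orders follow the ones compared later in this section.

    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA

    def fit_and_forecast(weekly: np.ndarray, horizon: int = 8):
        """Fit candidate ARIMA orders, keep the lowest-AIC fit, forecast `horizon` weeks ahead."""
        candidates = [(1, 0, 0), (0, 0, 1), (1, 1, 0), (0, 1, 1), (1, 1, 1), (2, 1, 2), (0, 1, 2)]
        fits = {order: ARIMA(weekly, order=order).fit() for order in candidates}
        best_order, best = min(fits.items(), key=lambda kv: kv[1].aic)
        return best_order, best.forecast(steps=horizon), best.resid

    # Synthetic weekly series with a drift of roughly 1.75 units per week.
    rng = np.random.default_rng(2)
    weekly = 100 + 1.75 * np.arange(80) + rng.normal(0, 3, 80)
    order, forecast, resid = fit_and_forecast(weekly)
    print(order, forecast[:3].round(1))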

Time series analysis of the long term: ARIMA (1,1,1) was used with SPSS 11.5 and the values were calculated within 95% confidence intervals. Melard's algorithm was used for estimation, while the termination criterion is when the maximum Marquardt constant is 1.00E+09 and the number of iterations reaches 10. The resultant equation is as follows:

z(t) = 1.66678 + 0.07442 * z(t-1) + 0.54446 * a(t-1) + a(t)   (5)

To check for the independence of the residuals, we calculated the auto-correlation function (acf) of the residuals and compared the critical values with the resultant t-values found by using Bartlett's approximation for the standard error of the estimated auto-correlations. The residuals pass the test for independence if the t-values for lags 1, 2 and 3 are below 1.25, while for the rest of the lags the t-values have to be less than 1.6.
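A compact version of this Bartlett-style check is sketched below; the thresholds of 1.25 and 1.6 follow the text above, while the function itself and its name are ours.

    import numpy as np

    def residual_t_values(resid: np.ndarray, max_lag: int = 10) -> np.ndarray:
        """t-values of the residual autocorrelations using Bartlett's approximation
        for the standard error: se(r_k) = sqrt((1 + 2 * sum_{i<k} r_i^2) / N)."""
        resid = resid - resid.mean()
        n = len(resid)
        denom = np.sum(resid ** 2)
        r = np.array([np.sum(resid[:-k] * resid[k:]) / denom for k in range(1, max_lag + 1)])
        se = np.sqrt((1 + 2 * np.concatenate([[0.0], np.cumsum(r[:-1] ** 2)])) / n)
        return r / se

    # Independence is accepted if |t| < 1.25 for lags 1-3 and |t| < 1.6 for the remaining lags.
    white = np.random.default_rng(3).normal(size=200)
    print(residual_t_values(white, max_lag=5).round(2))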


The t-values for the ARIMA (1,1,1) are safely below the critical values, strengthening our confidence in the model used. We compared our model with ARIMA (1,0,0), (0,0,1), (1,1,0), (0,1,1), (2,0,0), (0,0,2), (2,1,2) and (0,1,2). Against all these models, ARIMA (1,1,1) gave the lowest values according to the AICC, BIC and log-likelihood objective function and the smallest mean square prediction error [3]. The above-mentioned criteria not only improve the validity of the model but also fit the data parsimoniously by penalizing models with a large number of variables.

Models of Traffic Demand: The proposed ARIMA (1,1,1) model has been thoroughly diagnostically tested and compared with the other prospective models too. Differencing at lag 1 and a constant mu indicate that the long-term trend across all the traces is an exponential smoothing with growth, while the autoregressive part does not let the slope of the line be equal to mu until its effect dies out as the predictions are extended farther away in time. This also implies that the long-term forecasts ultimately follow a sloping line with gradient equal to mu (i.e. 1.75 Mbps). This is related to the average aggregate demand between the two nodes only. This increase of traffic demand between the nodes can be used to calculate the cumulative increase in the Internet traffic of the whole backbone.

The claim that the traffic demand between the two nodes increases at the rate of 1.75 Mbps using ARIMA (1,1,1) has been made possible because of the averaging done to aggregate the two-hour data onto one point per week. This simplified the Box-Jenkins methodology by removing the three seasonal components (12, 24 and 48 hours) and evening out the effect of outliers. Had the original time series been used, it would have led to a highly unstable model, inaccurate forecasts and, at the same time, a very expensive computation. Forecasting done by the aggregated time series is evaluated against the actual traffic in the following section.

Evaluation of Forecasts: Figure 8 shows the actual trend line with positive/negative deviations, as well as the predicted traffic demand within 1.5 times the deviation from the actual average value. The green vertical line demarcates the point from where onwards the predictions are made. Empirically, as well as visually, it is clear that the predictions made by our model fully encompass the traffic demand variation on a weekly basis. For capacity planning purposes, we would consider the upper bound on the aggregate demand, as we want to have a congestion-free backbone infrastructure.

Fig. 8. Forecasted Internet traffic demand against Months

We find the percentage error incurred by our forecasts when compared against the actual measurements for each week during the evaluation period. A positive error refers to an overestimation, while a negative one points to an underestimated future prediction, i.e. the observed demand was larger than the predicted. The accuracy of the predicted results can also be seen from the fact that the error values are centered towards zero despite fluctuations from time to time. On average, the error incurred is 5% for all the links between the two nodes. Forecasts could have been made for a longer period, but to be accurate and to keep the error to its minimum it was reasonable to predict for a smaller, restricted period. We start facing greater variations from the actual values as we move farther ahead in the time span of forecasting. The effect gets more visible when discrepancies across all the nodes/links accumulate into a large error in the backbone. Even a larger input data set is unlikely to give predictions in the far future that are close to the actual values. The solution lies in re-estimation. We propose to set a threshold for the maximum error encountered and revert back to the model fitting phase when this threshold is crossed. This would end up with a new, adaptive, better fitting model which gives future values that are more accurate than the previous model.
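The week-by-week error tally described above reduces to a signed percentage difference; the sketch below uses made-up numbers purely to illustrate the computation, not the measured results.

    import numpy as np

    def weekly_percentage_error(actual: np.ndarray, predicted: np.ndarray):
        """Signed percentage error per week (positive = overestimate) and its mean magnitude."""
        err = 100 * (predicted - actual) / actual
        return err, np.abs(err).mean()

    # Illustrative eight-week comparison only; not the paper's measurements.
    actual = np.array([140.0, 143.5, 149.0, 151.2, 158.0, 160.4, 166.1, 169.0])
    predicted = actual * (1 + np.array([0.04, -0.06, 0.07, -0.03, 0.05, -0.02, 0.06, -0.04]))
    err, mape = weekly_percentage_error(actual, predicted)
    print(err.round(1), round(mape, 1))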

IV. DISCUSSION AND FUTURE WORK

The first target is to overcome the limitation of the ARIMA model, which bases its predictions solely on previous data. By nature, ARIMA fails to take into account the impact of outside forces that may fundamentally change the pattern of the data.

Along with the forecasts at the week's level, we can extend the analysis to a finer time scale, daily or 12 hourly. This would give us the pattern from one day to the next, which might not be of much help in capacity planning for the core routers, but might be useful for other network engineering tasks like scheduling of maintenance windows or large database network backups.

Data at the granularity of 2 hours was taken and further aggregated to a scale of 6 hours so as to get rid of the variation at 12 hours and 24 hours. A much finer way would be to take data at the time scale of 1.5 hours and use MRA directly, so that the other contributing frequencies are not lost in simple, direct averaging. Another way to have a clearer picture of the traffic pattern in the whole backbone is to look into the output traffic data too, which is shown by the blue line on the MRTG graph.

V. CONCLUSION

The paper focuses on identifying the long term trend between Lahore and Karachi, the two major nodes of the Pakistan Internet Exchange. We came up with quantitative Internet traffic demand forecasts for the next eight weeks using past data spanning 2 years. The predictions were compared to the actual demand observed, and it was encouraging to observe minimal error between the estimates and the observed values.

We aggregated the traffic demand for all the links between the two nodes at the time granularity of two hours. Using aggregation and multi-resolution analysis it was found that there were three seasonal periods present in the data: 12 hours, 24 hours and a week. Instead of tackling the seasonal patterns, we smoothed out the finer resolution periods and identified the long term trends from the data. The daily variation had the major effect on the long term. So, a model for the Internet traffic demand pattern was made that had the approximation signal and a weighted contribution of the diurnal signal. This helps get a model that maps to the real traffic as accurately as possible in the long term, with minimal effects from shorter seasonal patterns and outliers. Weekly approximations are used to calculate forecasts using a low order ARIMA process. The forecasts deviated at most by 15% from the actual values, while on average there was a deviation of 7%, which is a very optimistic measure at such large time resolutions.

It was the use of aggregation, decomposition and averaging that helped in keeping the computation time within milliseconds, making the practical implementation of these calculations possible.

ACKNOWLEDGMENT

I am indebted to Dr Tariq Jadoon for helping me to get insight into the traffic engineering issues and for his continued support and encouragement throughout the research process. I express my gratitude to Dr Arif Zaman and Dr Sohaib A Khan for helping me ascertain the right direction by reviewing the experimentation of different ideas. Thanks to Mr. Amir Mehmood for willingly lending the proprietary PTCL data for research purposes in time.

REFERENCES

[1] K. Papagiannaki, "Provisioning IP Backbone Networks Based on Measurements," PhD thesis, University College London, March 2003.
[2] G. Box and G. Jenkins, Time Series Analysis: Forecasting and Control, Holden-Day, San Francisco, CA, 1970.
[3] P. Brockwell and R. Davis, Introduction to Time Series and Forecasting, Springer, 1996.
[4] N. K. Groschwitz and G. C. Polyzos, "A Time Series Model of Long-Term NSFNET Backbone Traffic," IEEE ICC'94, 1994.
[5] S. Basu and A. Mukherjee, "Time Series Models for Internet Traffic," 24th Conf. on Local Computer Networks, Oct. 1999, pp. 164-171.
[6] J. Bolot and P. Hoschka, "Performance Engineering of the World Wide Web: Application to Dimensioning and Cache Design," 5th International World Wide Web Conference, May 1996.
[7] K. Chandra, C. You, G. Olowoyeye, and C. Thompson, "Non-Linear Time-Series Models of Ethernet Traffic," Tech. Rep., CACT, June 1998.
[8] R. A. Golding, "End-to-end performance prediction for the Internet," Tech. Rep. UCSC-CRL-92-96, CISB, University of California, Santa Cruz, June 1992.
[9] A. Sang and S. Li, "A Predictability Analysis of Network Traffic," INFOCOM, Tel Aviv, Israel, Mar. 2000.
[10] R. Wolski, "Dynamically Forecasting Network Performance Using the Network Weather Service," Journal of Cluster Computing, 1999.
[11] A. Mehmood, T. Jadoon and N. Sheikh, "Evaluation of VoIP Quality over the Pakistan Internet Exchange (PIE) Backbone," IEEE-ICET, 2005.
[12] I. Daubechies, Ten Lectures on Wavelets, CBMS-NSF Regional Conference Series in Applied Mathematics, vol. 61, 1992.
[13] P. Brockwell and R. Davis, Introduction to Time Series and Forecasting, Springer, 1996.