Complex TimeSeries Analysis


  • 8/12/2019 Complex TimeSeries Analysis


    KIT: University of the State of Baden-Wuerttemberg and National Research Center of the Helmholtz Association

    KNOWLEDGE MANAGEMENT GROUP

    INSTITUTE OF APPLIED INFORMATICS AND FORMAL DESCRIPTION METHODS, FACULTY OF ECONOMICS AND BUSINESS ENGINEERING

    www.kit.edu

    Complex Time Series Analysis

    Binh Luong

    Presentation of Diploma Thesis

    Supervisors:

    Prof. Dr. Rudi Studer (KIT)

    Dr. Christoph Lingenfelder and Dr. Boris Charpiot (IBM Deutschland)

    Dr. Achim Rettinger and Dipl.-Inform. Benedikt Kämpgen (KIT)


    Knowledge Management Group

    Institute of Applied Informatics and Formal Description Methods, 20-07-2012

    Overview

    Introduction
      Time Series
      Seasonality
      Time Series Modeling Techniques
        ARIMA
        Exponential Smoothing

    Problems Analysis and Approaches
      Unstable Seasonal Pattern
      Multiple Seasonal Patterns
      Non-Integer Periodicity

    Evaluation Results

    Conclusions and Future Work


    INTRODUCTION


    Time Series - Definition

    Motivation: How to plan for the future?

    We need a tool to analyze past data and predict future data

    A time series (TS) is an ordered sequence of numeric values, observed at successive points in time.

    Time series are everywhere:

    Stock prices

    Exchange rate, interest rate, inflation rate, national GDP

    Retail sales

    Electric power consumption

    Temperatures at a weather station

    Unemployment figures for a region


    Time Series - Components

    A TS is a combination of 4 components: trend, seasonal, cycle, error


    Seasonality in Time Series

    IBM Netezza Analytics (INZA) determines the seasonality period as follows:

    1. Run a Fast Fourier Transformation (FFT)

    2. Find peaks in the frequency diagram

    3. Calculate the weight of each peak

    4. The period of the peak with the highest weight, rounded to the nearest integer, is set as the periodicity of the time series
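
    The four steps can be sketched in Python with NumPy. This is a minimal sketch, not INZA's implementation: the function names are hypothetical, a peak is assumed to be any spectral bin larger than both neighbours, and the weight is assumed to be a peak's share of the power among the retained peaks (the slides do not state how INZA computes weights).

```python
import numpy as np

def detect_seasons(values, top_k=3):
    """Return (period, weight) pairs for the strongest spectral peaks.

    Assumption: weight = share of spectral power among the retained peaks.
    """
    x = np.asarray(values, dtype=float)
    x = x - x.mean()                      # remove the constant level
    power = np.abs(np.fft.rfft(x)) ** 2   # periodogram
    freqs = np.fft.rfftfreq(len(x))       # cycles per observation
    # A peak is a bin that is larger than both of its neighbours.
    peaks = [k for k in range(1, len(power) - 1)
             if power[k] > power[k - 1] and power[k] > power[k + 1]]
    peaks.sort(key=lambda k: power[k], reverse=True)
    peaks = peaks[:top_k]
    total = sum(power[k] for k in peaks)
    return [(1.0 / freqs[k], power[k] / total) for k in peaks]

def detect_periodicity(values):
    """Round the period of the heaviest peak to the nearest integer."""
    best_period, _ = max(detect_seasons(values), key=lambda s: s[1])
    return int(round(best_period))
```

    For a series built from two sine waves with periods 10 and 5, `detect_periodicity` returns 10 when the period-10 wave has the larger amplitude.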


    Kernel-Run Analysis: Detected seasons
      Season: 10.1  Weight: 0.67
      Season: 4.9   Weight: 0.47
    Detected periodicity = 10


    Time Series Modeling Techniques

    The two most common TS modeling techniques are:

    ARIMA

    Exponential Smoothing

    ARIMA [1]:

    The forecast for a period is calculated as a weighted linear combination of its own past values and past errors:

    \hat{y}_t = c + \sum_i \phi_i y_{t-i} + \sum_j \theta_j e_{t-j}

    Exponential Smoothing [2,3]:

    Each component of a time series (trend, seasonal, error) is represented as a weighted moving average of all past values, with the weights decreasing exponentially:

    \hat{y}_{t+1} = \alpha y_t + (1 - \alpha) \hat{y}_t
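
    The exponentially decreasing weights can be made concrete with simple exponential smoothing, the most basic member of the family. The sketch below is illustrative only: it covers the level component alone (no trend or seasonal terms), and the function names and the smoothing weight `alpha` are assumptions, not names from the thesis.

```python
def simple_exponential_smoothing(values, alpha):
    """One-step-ahead forecast: level = alpha * y + (1 - alpha) * previous level."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    level = values[0]                     # initialise with the first observation
    for y in values[1:]:
        level = alpha * y + (1.0 - alpha) * level
    return level

def implied_weights(alpha, n):
    """Weights placed on the n most recent observations, newest first."""
    return [alpha * (1.0 - alpha) ** k for k in range(n)]
```

    With `alpha = 0.5` the three most recent observations receive the weights 0.5, 0.25 and 0.125, so each older value counts half as much as the one after it.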


    PROBLEMS ANALYSIS AND APPROACHES


    Overview of the problems

    In this work approaches are designed for 3 separate problems:

    Time series with unstable seasonal pattern

    Time series with non-integer periodicity

    Time series with multiple seasonal patterns

    Our approaches work as pre-processing and post-processing steps to solve those issues. ARIMA and Exponential Smoothing are still applied to model and forecast the time series.

    Formal description for each problem:

    Input:

    - A time series with one of the above-mentioned issues

    - Forecast horizon: a point in time in the future for which forecasts should be made

    Output:

    - Forecasting results: a list of pairs of


    Issue 1: Unstable Seasonal Pattern

    In some cases, the seasonal pattern in a TS is not stable, i.e. the length of the period varies over time.

    Example: monthly seasonal pattern in a daily time series

    [Figure: daily time series from 01.01.04 to 01.10.04 (values 0 to 7000) with ARIMA forecast; marked dates 1/8, 31/8, 30/9]

    Kernel-Run Analysis: Detected seasons
      Season: 30.428   Weight: 0.377501
      Season: 10.1519  Weight: 0.184886
      Season: 15.2192  Weight: 0.164569
    Detected periodicity = 30


    Approach for Unstable Seasonal Pattern (1)

    Problem: the length of each period varies over time (e.g. monthly seasonal pattern with 29, 30 or 31 days per month)

    Approach:

    1. Transform each period in the original TS into a new one with a single mean period length.

    2. Apply ARIMA or Exponential Smoothing for forecasting.

    3. At the end, retransform the forecasting results based on their real period lengths.
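
    Steps 1 and 3 can be sketched with linear interpolation in NumPy. This is a minimal sketch under stated assumptions: the function names are hypothetical, each period is supplied as a separate array, and the first and last value of a period are kept fixed while the points in between are resampled. The thesis may place the interpolation points differently.

```python
import numpy as np

def resample_period(values, new_length):
    """Linearly interpolate one period onto new_length equally spaced points."""
    values = np.asarray(values, dtype=float)
    old_pos = np.linspace(0.0, 1.0, num=len(values))
    new_pos = np.linspace(0.0, 1.0, num=new_length)
    return np.interp(new_pos, old_pos, values)

def normalize_periods(periods, target_length):
    """Step 1: bring every period (e.g. every month) to the same length."""
    return np.concatenate([resample_period(p, target_length) for p in periods])

def restore_periods(forecast, target_length, real_lengths):
    """Step 3: map each forecast period back to its real calendar length."""
    chunks = np.asarray(forecast, dtype=float).reshape(-1, target_length)
    return [resample_period(c, n) for c, n in zip(chunks, real_lengths)]
```

    For daily data with a monthly pattern, `normalize_periods(months, 30)` yields a series whose period is exactly 30, which can then be passed to ARIMA or Exponential Smoothing in step 2.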


    Approach for Unstable Seasonal Pattern (2)

    Illustration: Transformation of a month from 31 days into 30 days

    After the transformation, all periods have the same length

    [Figure: one month before the transformation (31 daily values) and after it (30 values), obtained by linear spline interpolation; values 0 to 45]


    Issue 2: Non-Integer Periodicity

    When the spectral analysis finds a seasonal pattern whose length is not an integer, its length is rounded to the nearest integer.

    This rounding causes inaccurate forecasts.

    To illustrate the problem we can use a trigonometric function:

    y_t = a \cdot \sin(2\pi t / p) + b

    For p=7.5 with Exponential Smoothing:

    [Figure: Exponential Smoothing forecast for the sine series with p=7.5; values 460 to 540, time points 1 to 136]

    Kernel-Run Analysis: Detected seasons
      Season: 7.58903  Weight: 0.99993
    Detected periodicity = 8


    Approach for Non-Integer Periodicity (1)

    Problem:

    - Although the TS has a non-integer periodicity, Exponential Smoothing cannot take this into account and simply uses the rounded periodicity found by the FFT for further analysis.

    - ARIMA is not affected by this problem.

    Approach:

    1. Transform the original TS to a new one that has an integer periodicity.

    2. Apply Exponential Smoothing to the new TS.

    3. Retransform the forecasting results using the original non-integer periodicity.
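
    Steps 1 and 3 amount to rescaling the time axis. The following Python sketch assumes linear interpolation between neighbouring observations; the function name `rescale_period` is hypothetical, and the exact resampling used in the thesis is not given in the slides.

```python
import numpy as np

def rescale_period(values, period, new_period):
    """Resample a series so that one cycle of length `period`
    spans `new_period` observations (linear interpolation)."""
    values = np.asarray(values, dtype=float)
    n_old = len(values)
    step = period / new_period            # original time units per new sample
    n_new = int(np.floor((n_old - 1) / step)) + 1
    new_pos = np.arange(n_new) * step     # sample positions on the old time axis
    return np.interp(new_pos, np.arange(n_old), values)
```

    Calling `rescale_period(y, 7.5, 8)` stretches a series with p=7.5 to p=8 (step 1); calling `rescale_period(forecast, 8, 7.5)` on the forecast maps it back to the original time axis (step 3). Linear interpolation introduces a small approximation error, so the round trip is not exact.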


    Approach for Non-Integer Periodicity (2)

    Illustration: Transformation of a TS with p=7.5 into one with p=8

    The new TS now has an integer periodicity p=8

    [Figure: the TS before the transformation (p=7.5, time points 1 to 31) and after it (p=8, time points 1 to 33), obtained by linear spline interpolation; values 460 to 540]


    Issue 3: Multiple Seasonal Patterns

    Some TS contain multiple seasonal patterns of different lengths

    To illustrate the problem we can use a trigonometric function:

    y_t = a \cdot (\sin(2\pi t / p_1) + \sin(2\pi t / p_2)) + b

    For p1=9 and p2=15 with Exponential Smoothing:

    [Figure: Exponential Smoothing forecast for the series with p1=9 and p2=15; values 430 to 570, time points 1 to 211]

    Kernel-Run Analysis: Detected seasons
      Season: 8.91999  Weight: 0.599654
      Season: 15.0681  Weight: 0.400238
    Detected periodicity = 9


    Approach for Multiple Seasonal Patterns (1)

    Problem:

    Exponential Smoothing can only handle one seasonal pattern. ARIMA provides quite good forecasts, which can still be improved.

    Approach:

    1. Remove all seasonal patterns iteratively, one after another, until there are no seasonal patterns left in the TS.

    2. Apply ARIMA or Exponential Smoothing to the deseasonalized TS.

    3. Add all removed seasonal patterns back to the forecasting results.
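
    The remove-and-restore cycle can be sketched in Python as follows. This is a sketch under assumptions rather than the thesis implementation: the periods are passed in explicitly instead of being detected by spectral analysis, each seasonal pattern is estimated as the mean deviation per phase of the cycle, and all function names are hypothetical.

```python
import numpy as np

def seasonal_profile(values, period):
    """Mean deviation from the overall mean at each phase of the cycle."""
    values = np.asarray(values, dtype=float)
    phases = np.arange(len(values)) % period
    centered = values - values.mean()
    return np.array([centered[phases == k].mean() for k in range(period)])

def remove_seasons(values, periods):
    """Step 1: subtract one seasonal profile after another."""
    residual = np.asarray(values, dtype=float).copy()
    profiles = []
    for period in periods:
        profile = seasonal_profile(residual, period)
        residual = residual - profile[np.arange(len(residual)) % period]
        profiles.append((period, profile))
    return residual, profiles

def add_seasons(forecast, start_index, profiles):
    """Step 3: add the stored profiles back onto a forecast whose
    first value belongs to time index start_index."""
    idx = start_index + np.arange(len(forecast))
    out = np.asarray(forecast, dtype=float).copy()
    for period, profile in profiles:
        out = out + profile[idx % period]
    return out
```

    For the illustration series with p1=9 and p2=15, `remove_seasons(y, [9, 15])` leaves an approximately constant residual, which is what step 2 then models.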


    Approach for Multiple Seasonal Patterns (2)

    Removing seasonal patterns iteratively:

    [Figure: four panels over time points 1 to 91: "Original TS with 2 seasonal patterns p1=9 and p2=15", "Adjusted TS with 1 seasonal pattern p=9", "Adjusted TS with 1 seasonal pattern p=15", "Adjusted TS with no seasonal pattern"]


    EVALUATION RESULTS


    Evaluation Metrics

    Root Mean Square Error (RMSE) [4,5]

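    \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{t=1}^{n} (y_t - \hat{y}_t)^2}

    Percentage Improvement

    \mathrm{Improvement} = \frac{\mathrm{RMSE}_{before} - \mathrm{RMSE}_{after}}{\mathrm{RMSE}_{before}} \cdot 100\%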
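
    Both metrics are short enough to write out. The Python sketch below uses hypothetical function names; the improvement formula is the relative RMSE reduction, which reproduces the percentages reported in the evaluation tables of this presentation.

```python
import math

def rmse(actual, forecast):
    """Root mean square error between observed and forecast values."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    return math.sqrt(squared / len(actual))

def percentage_improvement(rmse_before, rmse_after):
    """Relative reduction of the RMSE, in percent."""
    return (rmse_before - rmse_after) / rmse_before * 100.0
```

    For example, `percentage_improvement(827.13, 48.83)` gives about 94.10, the ARIMA improvement reported for the unstable seasonal pattern.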


    Unstable Seasonal Pattern

    Existing implementation vs. our approach

    [Figure: two panels, ARIMA and Exponential Smoothing; values 0 to 7000]

                   ARIMA     Exponential Smoothing
    RMSE before    827.13    909.09
    RMSE after     48.83     36.53
    Improvement    94.10%    95.98%


    Non-Integer Periodicity

    Existing implementation vs. our approach

    [Figure: two panels, "Before (Exponential Smoothing)" and "After (Exponential Smoothing)"; values 460 to 540, time points 1 to 136]

    Exponential Smoothing
                   p=3.5     p=7.5     p=18.5    p=30.5
    RMSE before    34.36     23.74     15.13     7.88
    RMSE after     5.2       1.21      0.21      0.08
    Improvement    41.69%    94.89%    98.58%    98.98%


    Multiple Seasonal Patterns

    Real data: hourly utility demand from a company in the USA

    [Figure: two panels, "Existing implementation with ARIMA" and "Our approach with ARIMA"; values 0 to 16000, time points 1 to 3001]

    ARIMA
    RMSE before    6535.4
    RMSE after     1515.96
    Improvement    76.80%


    Multiple Seasonal Patterns

    Real data: hourly utility demand from a company in the USA

    [Figure: two panels, "Existing implementation with Exponential Smoothing" and "Our approach with Exponential Smoothing"; values 0 to 16000, time points 1 to 3001]

    Exponential Smoothing
    RMSE before    1398.67
    RMSE after     853.05
    Improvement    39.01%


    CONCLUSIONS AND FUTURE WORK


    Conclusions

    Designed approaches that can be combined with the existing modelling techniques (ARIMA or Exponential Smoothing) to analyse time series with:

    Unstable seasonal pattern

    Non-integer periodicity

    Multiple seasonal patterns

    Prototype implementation.

    Evaluated our approaches and obtained a significant improvement in forecast accuracy compared to the existing implementation (RMSE reductions between 39% and 99% in the reported experiments).


    Future Work

    Unstable Seasonal Pattern

    Extend the algorithm to handle time series with a numerical time column

    Non-Integer Periodicity

    Distinguish between real non-integer periodicities and those caused by rounding errors of the spectral analysis

    Multiple Seasonal Patterns

    Specify a reasonable threshold to filter out seasonal patterns that result from the spectral analysis but do not represent real seasonal variation in the time series

    [Diagrams: time column types (real date vs. numeric); real non-integer periodicity vs. rounding error (found periodicity, weight, real periodicity)]


    References

    [1] G. E. P. Box and G. M. Jenkins, Time Series Analysis: Forecasting and Control, Holden-Day, San Francisco, 1970

    [2] E. S. Gardner, Jr., Exponential smoothing: The state of the art, Journal of Forecasting 4 (1985)

    [3] E. S. Gardner, Jr., Exponential smoothing: The state of the art, Part II, International Journal of Forecasting 22 (2006)

    [4] B. Abraham and J. Ledolter, Statistical Methods for Forecasting, John Wiley & Sons, New York, 1983

    [5] W. Reinmuth, W. Mendenhall, and R. J. Beaver, Statistics for Management and Economics, Duxbury Press, Belmont, California, 1993


    Thank you for your attention!
