SADC Course in Statistics Time Series: An Introduction (Session 01)
Slide 2
Time Series Learning Objectives
By the end of the next 4 sessions, devoted to time series, you will be able to
• appreciate the broader concept of data where time is a factor
• understand basic time series concepts and terminology
• decompose a time series to look at trends and seasonal effects, and do simple forms of forecasting
• concisely summarize results of time series analysis in writing
Slide 3
Learning Objectives – this session
By the end of this session, you will be able to
• give examples of data collected over time
• state objectives of a time series analysis
• appreciate the importance of graphing data
• interpret key features emerging from an examination of a time series
• report main findings from a graphical presentation of time series data
Slide 4
Basics: Definitions and Notation
• A time series is a collection of observations made sequentially through time
• Such observations may be denoted by Y_1, Y_2, Y_3, …, Y_t, …, Y_T, where Y_t is the observation at time t, since data are usually collected at discrete points in time
• The interval between observations can be any time interval (hours within days, days, weeks, months, years, etc.).
Slide 5
Some areas of applications
• Time series can occur in a wide range of fields, from economics to sociology, meteorology to financial investment, etc.
• Some examples of time series are:
  – Monthly closings of the stock exchange index
  – Malaria incidence or deaths over calendar years
  – Daily maximum temperatures
  – Hourly records of babies born at a maternity hospital
• Can you suggest other examples?
Slide 6
Basics: Types of time series
• Observations made continually in time give rise to a Continuous Time Series, e.g.
  – Thermometer readings at a Met station (continuously measured)
  – Measurement of whether air pollution reached increasing levels of unacceptability at an industrial site (air pollution levels are continuous)
• More often, observations are taken only at specific points in time, giving rise to a Discrete Time Series, e.g.
  – annual number of road accidents (discrete)
  – maximum daily temperature (continuous)
  – whether or not it rained each day (binary)
Slide 7
Objectives of a time series analysis
• Description (often with monitoring data)
  – Merely to describe the patterns over time
• Explanation
  – Can the pattern observed over time be explained in terms of other factors or causes? This helps in understanding the behaviour of the series
• Prediction (forecasting)
  – Can past records help us to predict what will happen in the future?
• Improving the past system/behaviour
  – If factors affecting the behaviour of a variable over time can be identified, action may be taken to improve the system, e.g. action over increasing levels of air pollution
Slide 8
Analysing Series with time element
• Where the time element is just incidental, it may not be necessary to use a formal time series analysis approach
  – e.g. start of the rainy season each year at a tobacco farm
• The analysis used depends on the objective(s) of the study
• It can vary from just descriptive methods to more advanced analysis approaches
• In these time series sessions, we will largely concentrate on simple approaches.
Slide 9
Approach in this session
• We begin with some examples showing the importance of graphing the data to get an insight into the distribution over time
• For other examples, refer to 2.1.1 in CAST for SADC – Higher Level
• We then summarise some lessons that can be learnt from graphing the data in time
Slide 10
Jumping to conclusions from raw data
Data (interval-scale): Company profits (‘000 dollars)
Objective: To study changes in profit figures over consecutive quarters

Year  Quarter 1  Quarter 2  Quarter 3  Quarter 4
1     667        631        675        699
2     739        695        751        779
3     823        795        835        875
4     931        855        939        967

Impression is that the 4th quarter is always higher than the 1st quarter
Slide 11
Take a look again…
Previous impression is largely because there is a general increase over time
[Figure: time plot of profits (in '000 $, axis from 600 to 1000) against year (axis from 1993 to 1998)]
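The impression from the raw table can be checked with simple arithmetic. The following is a minimal Python sketch using the profit figures from slide 10; the variable names are illustrative only and are not part of the course material. It compares the 4th quarter with the 1st quarter of the same year, and then the 1st quarter with the 4th quarter of the year before.

```python
# Quarterly profits ('000 dollars) from the table on slide 10, one row per year.
profits = [
    [667, 631, 675, 699],
    [739, 695, 751, 779],
    [823, 795, 835, 875],
    [931, 855, 939, 967],
]

# Within each year: 4th quarter minus 1st quarter.
q4_minus_q1 = [row[3] - row[0] for row in profits]

# Across the year boundary: next year's 1st quarter minus this year's 4th quarter.
next_q1_minus_q4 = [
    profits[i + 1][0] - profits[i][3] for i in range(len(profits) - 1)
]

# Yearly totals, to show the general increase over time.
yearly_totals = [sum(row) for row in profits]

print(q4_minus_q1)       # [32, 40, 52, 36]
print(next_q1_minus_q4)  # [40, 44, 56]
print(yearly_totals)     # [2672, 2964, 3328, 3692]
```

Every 1st quarter is also higher than the 4th quarter just before it, so the "4th quarter is always higher" impression is consistent with a general upward trend rather than a special 4th-quarter effect.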
Slide 12
Jumping to conclusions from summaries
Objective: to emphasize the need for graphing distributions in order to get a clearer understanding of the data distribution
Data (interval-scale): Breaking strengths of parcel string tested on a piece selected every 5 minutes from one spool during production. 100 samples from each of 3 different days (simulated data)
Data source: Petruccelli, J; MSOR Connections Vol 7 No 2, 2007

         Day 1  Day 2  Day 3
Mean     20.81  20.81  20.81
Std Dev  0.72   0.72   0.72

Summary statistics identical!
Slide 13
Take a look again…
[Figures: three time plots, Day 1, Day 2 and Day 3, each showing breaking strengths (axis from 19 to 23) against consecutive samples (axis from 0 to 100)]
• The distributions are definitely different!
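To illustrate how identical summaries can hide different behaviour over time, here is a minimal Python sketch. It does not use the Petruccelli data: the values are hypothetical, generated with a fixed random seed, and the names `stable`, `drifting`, `summary` and `half_means` are assumptions made for this example. The same 100 values are arranged in two different time orders.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical breaking strengths: 100 values scattered around 20.8.
values = [round(rng.gauss(20.8, 0.7), 2) for _ in range(100)]

stable = values[:]          # random scatter around a constant level
rng.shuffle(stable)
drifting = sorted(values)   # the same values arranged as a steady upward drift


def summary(series):
    """Mean and sample standard deviation, rounded to 2 decimal places."""
    return round(statistics.mean(series), 2), round(statistics.stdev(series), 2)


def half_means(series):
    """Mean of the first and second half: a crude check for change over time."""
    mid = len(series) // 2
    return statistics.mean(series[:mid]), statistics.mean(series[mid:])


print(summary(stable), summary(drifting))  # the two summaries are identical
print(half_means(stable))
print(half_means(drifting))                # second half is clearly higher
```

The mean and standard deviation ignore the order of the observations, so only a time plot (or a comparison over time such as `half_means`) reveals the drift.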
Slide 14
Discussion exercise
• Level of data for analysis depends on objectives
  – Level: time period
    » Botswana hours of sunshine data
  – Level: Local, National, International
    » Malaria incidence with rainfall pattern relationship (between variables)
    » Malaria incidence comparisons (between countries)
• In small groups, study the information on slides 15-20. Discuss what the graphs indicate and report back to the whole class after 20 minutes.
Slide 15
Zambia Rainfall Data
Problem: Farmers in Southern Zambia are moving out of the province because they believe that climate change is affecting farm production.
A local NGO promoting Conservation Farming insists that the problem is due to bad farming practice.
A study was commissioned to investigate the problem; one of the events investigated was the “Start of the Rains” (defined as >20 mm of rainfall in 3 days, after 15 November)
Slide 16
Start of Rains
Objective: to investigate if there has been any change in start of the rainy season in Southern Zambia
Data source: Moorings Station, Monze, Southern Zambia
Data (interval-scale): “Start of Rains” calculated as day number (from July 1st) of the first 3-day spell with >20 mm rain after November 15th
What is your answer to the question?
[Figure: time plot of day number with >20 mm rain over 3 days (axis from 120 to 200) against season (axis from 1920 to 2000)]
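The "Start of Rains" definition can be turned into a small calculation. The following Python sketch is one possible reading of the definition, not the method used in the study. It assumes that 1 July is day 1, that the whole 3-day spell must fall after 15 November, that the reported day is the first day of the spell, and that dates missing from the records count as zero rainfall. The function name `start_of_rains` and the example records are hypothetical.

```python
from datetime import date, timedelta


def start_of_rains(rain_mm, season_year, threshold_mm=20.0, spell_days=3):
    """Day number (1 July = day 1) of the first day of the first qualifying spell.

    rain_mm maps datetime.date -> daily rainfall in mm; missing dates count as 0.
    Returns None if no spell qualifies before the following 30 June.
    """
    origin = date(season_year, 7, 1)
    start = date(season_year, 11, 16)  # first date strictly after 15 November
    season_end = date(season_year + 1, 6, 30)
    while start + timedelta(days=spell_days - 1) <= season_end:
        total = sum(
            rain_mm.get(start + timedelta(days=k), 0.0) for k in range(spell_days)
        )
        if total > threshold_mm:
            return (start - origin).days + 1
        start += timedelta(days=1)
    return None


# Hypothetical daily records for one season (not the Moorings data).
example = {
    date(2000, 11, 10): 30.0,  # heavy rain, but before the cut-off date
    date(2000, 11, 20): 8.0,
    date(2000, 11, 21): 7.0,
    date(2000, 11, 22): 6.0,
}
print(start_of_rains(example, 2000))  # 143
```

Applying such a function to each season in turn would give one value per year, which is the kind of series plotted on this slide.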
Slide 21
Lessons summarised
• The level to which the data needs to be summarised before analysis depends on the objective(s) of the study
• The specific analysis depends on the objectives; a descriptive analysis will often be sufficient
• Different levels of data will be needed depending on whether the problem is being looked at at the international, national or local level
• It is imperative, however, that quality data be made accessible to ensure that conclusions arising from the analysis are correct.
Slide 22
Time Plots
• A time plot is a plot of the measurement of interest against the time of the observation
• No matter what you decide is the appropriate way to analyse your data, the time factor must not be ignored.
• As we have seen in the examples considered in this session, it is very important to start the exploration of a time series with a graphical representation of the data.
• However, there are a number of points to be kept in mind when drawing such a plot, as discussed in the next two slides
Slide 23
Choice of sampling interval
The two figures show an ECG of a healthy woman. The bottom one is sampled at a shorter interval; the top one is sampled at a longer interval and misses the characteristic peak of the heartbeat.
So the choice of the sampling interval is quite important: sampling too frequently can be costly, and sampling too infrequently might miss essential characteristics
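The effect of the sampling interval can be reproduced with a toy example. The following Python sketch uses a hypothetical signal (not ECG data) with a sharp one-sample spike every 10 samples; the helper `sample_every` is an assumption made for this illustration.

```python
# Hypothetical signal: flat baseline with a sharp one-sample spike every 10 samples.
signal = [5.0 if i % 10 == 5 else 1.0 for i in range(100)]


def sample_every(series, step, offset=0):
    """Keep every `step`-th observation, starting at index `offset`."""
    return series[offset::step]


fine = sample_every(signal, 1)     # every observation
coarse = sample_every(signal, 10)  # every 10th observation: indices 0, 10, 20, ...

print(max(fine))    # 5.0, the spikes are captured
print(max(coarse))  # 1.0, every spike falls between sampling points
print(len(fine), len(coarse))  # 100 10
```

The coarse series needs a tenth of the measurements but shows a completely flat line, which is the trade-off between cost and missed features described above.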
Slide 24
Choice of aspect ratio
• Notice: different aspect ratios emphasize different characteristics of the series – the top one brings out the differences in the peaks while the lower one highlights the way the peaks rise and fall
Slide 25
To join or not to join
Same data as in slide 18 but without the points joined up
[Figure: plot of percent1 and percent3 (axis from 30 to 110) against year (axis from 1975 to 2000), points not joined]
Slide 26
To join or not to join
Advantage of joining – usually easier to digest
Disadvantage – gives impression of continuity; definitely a risk when missing values exist
Return now to example on slide 15 for some practical work in order to ensure learning objectives are achieved…
(Details are outlined in Practical 01)
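One way to keep the advantage of joined points without suggesting continuity across missing values is to join points only within runs of consecutive observations. The following Python sketch shows the idea; it assumes missing values are recorded as None, and the function name `contiguous_segments` and the example data are hypothetical.

```python
def contiguous_segments(times, values):
    """Split a series into runs of consecutive non-missing observations.

    Missing observations are represented by None. Each run is a list of
    (time, value) pairs; a plotting routine could join points within a run
    and leave a visible gap between runs.
    """
    segments, current = [], []
    for t, v in zip(times, values):
        if v is None:
            if current:
                segments.append(current)
                current = []
        else:
            current.append((t, v))
    if current:
        segments.append(current)
    return segments


# Hypothetical yearly percentages with two missing years.
years = [1990, 1991, 1992, 1993, 1994, 1995]
percent = [62.0, 65.0, None, None, 80.0, 78.0]
print(contiguous_segments(years, percent))
# [[(1990, 62.0), (1991, 65.0)], [(1994, 80.0), (1995, 78.0)]]
```

Drawing each segment as a separate line leaves the gap for 1992 and 1993 visible instead of implying a smooth rise between 1991 and 1994.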
Slide 27
With reference to slide 19, note that: “The World Health Organisation does not warrant that the information contained in the web site is complete and correct and shall not be liable whatsoever for any damages incurred as a result of its use”.
The WHO website further adds: “Extracts of WHO information can be used for private study or for educational purposes without permission. Wider use requires permission to be obtained from WHO”.