Observability foundations in dynamically evolving architectures
Dynamically Evolving Systems: Cluster Analysis Using Time
-
Upload
magnify-analytic-solutions -
Category
Documents
-
view
541 -
download
0
description
Transcript of Dynamically Evolving Systems: Cluster Analysis Using Time
![Page 1: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/1.jpg)
SAS Global ForumSAS Global ForumApril 24, 2012April 24, 2012
Dynamically Evolving Dynamically Evolving Systems:Systems:
Cluster Analysis Using TimeCluster Analysis Using Time
David J. Corliss, PhDDavid J. Corliss, PhD
Magnify Analytic SolutionsMagnify Analytic SolutionsWINNER
BEST CONTRIBUTED STATS PAPER
2012 SAS GLOBAL FORUM
WINNER BEST
CONTRIBUTED STATS PAPER
2012 SAS GLOBAL FORUM
![Page 2: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/2.jpg)
OUTLINE
Overview and HistoryOverview and History
A Basic ExampleA Basic Example
Plotting the ResultsPlotting the Results
StandardizationStandardizationNon-Periodic Non-Periodic
EventsEventsSummarySummary
![Page 3: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/3.jpg)
A Typical Cluster Analysis in SAS
![Page 4: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/4.jpg)
title2 'Preliminary Analysis by FASTCLUS'; proc fastclus data=iris summary maxc=10 maxiter=99 converge=0 mean=mean out=prelim cluster=preclus; var petal: sepal:; run;
An Example of PROC FASTCLUS
![Page 5: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/5.jpg)
title2 'Preliminary Analysis by FASTCLUS'; proc fastclus data=iris summary maxc=10 maxiter=99 converge=0 mean=mean out=prelim cluster=preclus; var petal: sepal:; run;
An Example of PROC FASTCLUS
![Page 6: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/6.jpg)
title2 'Preliminary Analysis by FASTCLUS'; proc fastclus data=iris summary maxc=10 maxiter=99 converge=0 mean=mean out=prelim cluster=preclus; var petal: sepal:; run;
An Example of PROC FASTCLUS
![Page 7: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/7.jpg)
title2 'Preliminary Analysis by FASTCLUS'; proc fastclus data=iris summary maxc=10 maxiter=99 converge=0 mean=mean out=prelim cluster=preclus; var petal: sepal:; run;
An Example of PROC FASTCLUS
![Page 8: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/8.jpg)
title2 'Preliminary Analysis by FASTCLUS'; proc fastclus data=iris summary maxc=10 maxiter=99 converge=0 mean=mean out=prelim cluster=preclus; var petal: sepal:; run;
An Example of PROC FASTCLUS
![Page 9: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/9.jpg)
title2 'Preliminary Analysis by FASTCLUS'; proc fastclus data=iris summary maxc=10 maxiter=99 converge=0 mean=mean out=prelim cluster=preclus; var petal: sepal:; run;
An Example of PROC FASTCLUS
![Page 10: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/10.jpg)
An Example of PROC FASTCLUS
![Page 11: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/11.jpg)
Cluster Analysis AppliedCluster Analysis Appliedto Time Series Datato Time Series Data
**** NOAA Precipitation Data ****; data work.noaa; infile "/home/sas/NESUG/noaa_mi_1950_2009_tab.txt" dsd dlm='09'x lrecl=1500 truncover firstobs=2; input state_code :3.0 division :3.0 year_month :$6. pcp :6.2 ; length year 8.0 month 8.0; year = left(year_month,1,4); month = right(year_month,5,2); run;
![Page 12: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/12.jpg)
Plotting the ResultsPlotting the Results
goptions device=png;symbol1 font=marker value=u height=0.6 c=blue;symbol2 font=marker value=u height=0.6 c=red;symbol3 font=marker value=u height=0.6 c=yellow;symbol4 font=marker value=u height=0.6 c=green;legend1 frame cframe=ligr label=none cborder=black position=center value=(justify=center);axis1 label=(angle=90 rotate=0) minor=none;axis2 minor=none; proc gplot data=work.cluster3; plot year * month = cluster /frame cframe=ligr legend=legend1 vaxis=axis1 haxis=axis2;run;
![Page 13: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/13.jpg)
Plotting the ResultsPlotting the Results
goptions device=png;symbol1 font=marker value=u height=0.6 c=blue;symbol2 font=marker value=u height=0.6 c=red;symbol3 font=marker value=u height=0.6 c=yellow;symbol4 font=marker value=u height=0.6 c=green;legend1 frame cframe=ligr label=none cborder=black position=center value=(justify=center);axis1 label=(angle=90 rotate=0) minor=none;axis2 minor=none; proc gplot data=work.cluster3; plot year * month = cluster /frame cframe=ligr legend=legend1 vaxis=axis1 haxis=axis2;run;
![Page 14: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/14.jpg)
Plotting the ResultsPlotting the Results
goptions device=png;symbol1 font=marker value=u height=0.6 c=blue;symbol2 font=marker value=u height=0.6 c=red;symbol3 font=marker value=u height=0.6 c=yellow;symbol4 font=marker value=u height=0.6 c=green;legend1 frame cframe=ligr label=none cborder=black position=center value=(justify=center);axis1 label=(angle=90 rotate=0) minor=none;axis2 minor=none; proc gplot data=work.cluster3; plot year * month = cluster /frame cframe=ligr legend=legend1 vaxis=axis1 haxis=axis2;run;
![Page 15: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/15.jpg)
Plotting the ResultsPlotting the Results
goptions device=png;symbol1 font=marker value=u height=0.6 c=blue;symbol2 font=marker value=u height=0.6 c=red;symbol3 font=marker value=u height=0.6 c=yellow;symbol4 font=marker value=u height=0.6 c=green;legend1 frame cframe=ligr label=none cborder=black position=center value=(justify=center);axis1 label=(angle=90 rotate=0) minor=none;axis2 minor=none; proc gplot data=work.cluster3; plot year * month = cluster /frame cframe=ligr legend=legend1 vaxis=axis1 haxis=axis2;run;
![Page 16: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/16.jpg)
Cluster Analysis AppliedCluster Analysis Appliedto Time Series Datato Time Series Data
![Page 17: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/17.jpg)
Rate of Change VariablesRate of Change Variables**** interval percent change ****; proc sort data=work.doe; by year week;run; data work.doe; set work.doe; by year week; retain pw_price_index; output; if first.week then pw_price_index = annualized_price_index;run; data work.doe; set work.doe; by year week; weekly_pct_change = (annualized_price_index - pw_price_index) * 100;run;
![Page 18: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/18.jpg)
Rate of Change VariablesRate of Change Variables**** interval percent change ****; proc sort data=work.doe; by year week;run; data work.doe; set work.doe; by year week; retain pw_price_index; output; if first.week then pw_price_index = annualized_price_index;run; data work.doe; set work.doe; by year week; weekly_pct_change = (annualized_price_index - pw_price_index) * 100;run;
![Page 19: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/19.jpg)
Rate of Change VariablesRate of Change Variables**** interval percent change ****; proc sort data=work.doe; by year week;run; data work.doe; set work.doe; by year week; retain pw_price_index; output; if first.week then pw_price_index = annualized_price_index;run; data work.doe; set work.doe; by year week; weekly_pct_change = (annualized_price_index - pw_price_index) * 100;run;
![Page 20: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/20.jpg)
Rate of Change VariablesRate of Change Variables**** interval percent change ****; proc sort data=work.doe; by year week;run; data work.doe; set work.doe; by year week; retain pw_price_index; output; if first.week then pw_price_index = annualized_price_index;run; data work.doe; set work.doe; by year week; weekly_pct_change = (annualized_price_index - pw_price_index) * 100;run;
![Page 21: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/21.jpg)
Standardization of VariablesStandardization of Variables
proc standard data=work.doe mean=0 std=1 out=work.doe_stan; var week annualized_price_index weekly_pct_change supply supply_pct_change;run; proc fastclus data=work.doe_stan maxc=6 maxiter=20 out=work.cluster1; var week annualized_price_index weekly_pct_change supply supply_pct_change; run;
![Page 22: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/22.jpg)
Use of PROC STANDARDUse of PROC STANDARD
proc standard data=work.doe mean=0 std=1 out=work.doe_stan; var week annualized_price_index weekly_pct_change supply supply_pct_change;run; proc fastclus data=work.doe_stan maxc=6 maxiter=20 out=work.cluster1; var week annualized_price_index weekly_pct_change supply supply_pct_change; run;
![Page 23: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/23.jpg)
Use of PROC STANDARDUse of PROC STANDARD
proc standard data=work.doe mean=0 std=1 out=work.doe_stan; var week annualized_price_index weekly_pct_change supply supply_pct_change;run; proc fastclus data=work.doe_stan maxc=6 maxiter=20 out=work.cluster1; var week annualized_price_index weekly_pct_change supply supply_pct_change; run;
![Page 24: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/24.jpg)
Standardization of Volatile DataStandardization of Volatile Data
![Page 25: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/25.jpg)
Identification of SeasonsIdentification of Seasons
Blue: Post-holiday lull
Purple: Winter Season
Red: Spring Run-upAmber: Summer Driving
SeasonYellow: Holiday Spikes
Green: Fall Season
![Page 26: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/26.jpg)
Non-Periodic EventsNon-Periodic Eventsof Variable Durationof Variable Duration
**** Time series of high velocity absorption events ****; data work.first_date; set work.hva; by event_id jd_minus_24e5; if first.event_id; first_date = jd_minus_24e5; keep event_id first_date;run; (and similarly for the last date of each event, which are merged with the first date) data work.hva; merge work.hva work.first_last; by event_id; percent_duration = (day_of_event / duration) * 100;run;
![Page 27: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/27.jpg)
One-Time EventsOne-Time Eventsof Variable Durationof Variable Duration
**** Time series of high velocity absorption events ****; data work.first_date; set work.hva; by event_id jd_minus_24e5; if first.event_id; first_date = jd_minus_24e5; keep event_id first_date;run; (and similarly for the last date of each event, which are merged with the first date) data work.hva; merge work.hva work.first_last; by event_id; percent_duration = (day_of_event / duration) * 100;run;
![Page 28: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/28.jpg)
Non-Periodic EventsNon-Periodic Eventsof Variable Durationof Variable Duration
**** Time series of high velocity absorption events ****; data work.first_date; set work.hva; by event_id jd_minus_24e5; if first.event_id; first_date = jd_minus_24e5; keep event_id first_date;run; (and similarly for the last date of each event, which are merged with the first date) data work.hva; merge work.hva work.first_last; by event_id; percent_duration = (day_of_event / duration) * 100;run;
![Page 29: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/29.jpg)
Non-Periodic EventsNon-Periodic Eventsof Variable Durationof Variable Duration
**** Time series of high velocity absorption events ****; data work.first_date; set work.hva; by event_id jd_minus_24e5; if first.event_id; first_date = jd_minus_24e5; keep event_id first_date;run; (and similarly for the last date of each event, which are merged with the first date) data work.hva; merge work.hva work.first_last; by event_id; percent_duration = (day_of_event / duration) * 100;run;
![Page 30: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/30.jpg)
Absolute versusAbsolute versusRelative MeasuresRelative Measures
![Page 31: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/31.jpg)
Time Series Clustering Time Series Clustering of Non-Periodic Eventsof Non-Periodic Events
![Page 32: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/32.jpg)
SummaryThe SAS procedures developed for cluster analysis can be applied to events that change over time to identify distinct, successive stages in dynamically evolving systems.
Both cyclical and non-repeating events may be analyzed.
Normalization of fields through the use of PROC STANDARD may be necessary prior to cluster analysis to obtain the best results.
![Page 33: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/33.jpg)
References
Fisher, R.A., 1936, Annals of Eugenics, 7, 2, 179
Tryon, R. C., 1939, Cluster analysis. Ann Arbor: Edwards Brothers
The SAS Institute, Cary, N.C. www.support.sas.com
![Page 34: Dynamically Evolving Systems: Cluster Analysis Using Time](https://reader033.fdocuments.net/reader033/viewer/2022061606/555df5ddd8b42a3f618b4b17/html5/thumbnails/34.jpg)
Questions
David J. Corliss
http://astro1.panet.utoledo.edu/~dcorliss