Post on 17-Aug-2020
Solar Nowcasting with Cluster-based Detrending
Antonio Sanfilippo, Luis Pomares, Daniel Perez-Astudillo, Nassma Mohandes, Dunia Bachour
ICEM 2017 – Oral Presentation, 26-29 June 2017, Bari, Italy
Overview
• Problem Statement
• Background
• Hypothesis
• Approach & Data
• Results
• Next Steps
Problem Statement
• Solar forecasting is crucial in managing PV integration
– Forward commitment of generation units (intra-day and day ahead)
– Variable generation ramps (minutes/hours ahead)
– PV integration in the power distribution system
– Transmission congestion management
– Energy trading … and more
• Challenges for Qatar: variability due to:
– Cloudiness during half of the year
– Micro-climates due to sea-land climatic interactions
– Aerosols in the atmosphere
   • Emissions from industrial and urban land use
   • High loads of dust in the atmosphere
Solar variability in Qatar by hour & day of the year
2014 QEERI dataset from Villa F
How to improve solar forecasting?
• Use stochastic models for predictions up to 6 hours ahead, and physics-based models for longer predictions¹
• Use multi-modeling² and ensemble machine learning³ to combine stochastic models
• Integrate predictions from physics-based models into stochastic models
• De-trending – our focus in this study
– Group time series data into coherent subsets to train more accurate stochastic solar forecasting models
– QUESTION: WHAT IS THE BEST DE-TRENDING METHOD?
• The use of multi-model classifiers
¹ Diagne et al. 2013, Inman et al. 2013; ² Sanfilippo et al. 2015; ³ Lauret et al. 2012
Hypothesis
• Using data mining techniques to cluster solar time series data creates datasets with stronger internal coherence than other approaches
– Training datasets with stronger internal coherence help train more accurate forecasting models
Approach
• Partition time series of solar irradiance data according to variability using season-based and clustering methods
• Assess the relative performance of each de-trending method by evaluating the same forecasting algorithms with different data partitions
• Develop a technique to identify the class of each solar irradiance time series so that each can be matched with the appropriate forecasting solution
De-trending approaches
1. K-means clustering
2. X-Means clustering
3. Cascade Simple K-Means clustering
4. M-Tree clustering
5. EM (expectation maximization) clustering
6. LVQ clustering
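As an illustration of the first method in the list, here is a minimal K-means sketch in plain Python. It is not the implementation used in the study (the slides do not name a toolkit); the function name `kmeans`, the random initialization, and the toy clearness-index windows are assumptions made for the example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=50, seed=0):
    """Plain K-means: returns (labels, centroids).

    Initial centroids are k points drawn at random (an assumption;
    other initialization schemes are common).
    """
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    [sum(col) / len(members) for col in zip(*members)]
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids


# Toy windows: three clear-sky-like and three cloudy-like records (made up).
windows = [
    [0.90, 0.90, 0.88], [0.92, 0.91, 0.90], [0.89, 0.90, 0.91],
    [0.20, 0.30, 0.25], [0.25, 0.20, 0.30], [0.30, 0.25, 0.20],
]
labels, centroids = kmeans(windows, k=2)
```

The other methods in the list differ mainly in how the number of clusters is chosen or how membership is modeled; EM, for instance, assigns soft probabilities rather than hard labels.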
Forecasting Focus: Near-real time
• Use regression to learn model coefficients which provide the basis for prediction by measuring the relation between an observation at time t and observations at previous times
– Persistence
– Autoregressive models: AR(3) and AR(11)
– NN1 … NN5 (feedforward neural networks)
– NARX (nonlinear autoregressive network with exogenous inputs)
– RNN (layer recurrent neural network)
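The two simplest entries in this list can be sketched as follows. This is a minimal illustration, assuming an AR model without intercept fitted by ordinary least squares; the slides do not state how the AR coefficients were estimated, and all function names here are hypothetical.

```python
def persistence_forecast(history, steps=12):
    """Persistence: repeat the last observation for every forecast step."""
    return [history[-1]] * steps


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ar(series, order):
    """Least-squares AR(order) coefficients (no intercept: an assumption).

    coeffs[k] multiplies the observation k + 1 steps in the past.
    """
    rows = [[series[t - k - 1] for k in range(order)]
            for t in range(order, len(series))]
    targets = series[order:]
    # Normal equations: (X^T X) coeffs = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(order)]
           for i in range(order)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(order)]
    return solve_linear_system(xtx, xty)


def ar_forecast(history, coeffs, steps=12):
    """Iterate the AR recursion, feeding each prediction back as input."""
    window = list(history)
    forecast = []
    for _ in range(steps):
        nxt = sum(c * window[-k - 1] for k, c in enumerate(coeffs))
        forecast.append(nxt)
        window.append(nxt)
    return forecast


# Synthetic check: a noise-free AR(2) series should be recovered exactly.
series = [1.0, 0.0]
for _ in range(40):
    series.append(0.5 * series[-1] + 0.3 * series[-2])
coeffs = fit_ar(series, order=2)
```

With `steps=12` and 5-minute data, both functions cover a one-hour horizon, matching the 12-steps-ahead setting used later in the deck.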
Data collection equipment
• Radiometric ground monitoring station
– Secondary Standard pyranometers for measuring GHI and DHI
– First Class pyrheliometer for measuring DNI
• Installed on the rooftop of an office building in Doha’s Education City
– Lat: 25.33° N, Lon: 51.42° E
• Daily maintenance
Data Collected
• 5-minute averages of direct normal (DNI), diffuse horizontal (DHI) and global horizontal (GHI) irradiance measured in W/m² over one year (2014)
• Baseline Surface Radiation Network (BSRN) and Long quality control checks
– Extremely rare limits, physical limits, consistency checks
– Other advanced filters developed to address limitations of BSRN
Choosing a forecasting target: Ktp
• GHI is the relevant measure for PV (Pelland et al. 2013)
• The Clearness Index (Kt) is used to quantify the impact of the atmosphere on GHI
– Kt = ratio of GHI to the corresponding irradiance outside the atmosphere (extraterrestrial irradiance)
• Normalize Kt (Ktp) to alleviate the dependence of Kt on zenith angle
– Normalize Kt with respect to a standard clear-sky global irradiance profile for a relative air mass of one
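A minimal sketch of the two quantities, under stated assumptions: the slides do not give the normalization formula, so the widely used Perez et al. (1990) form is assumed here, together with the Kasten and Young (1989) relative air mass. The function names are hypothetical, and the extraterrestrial horizontal irradiance is taken as an input rather than computed.

```python
import math


def relative_air_mass(zenith_deg):
    """Kasten and Young (1989) relative optical air mass (assumed choice)."""
    cos_z = math.cos(math.radians(zenith_deg))
    return 1.0 / (cos_z + 0.50572 * (96.07995 - zenith_deg) ** -1.6364)


def clearness_index(ghi, extraterrestrial_horizontal):
    """Kt: measured GHI over the irradiance on a horizontal plane outside the atmosphere."""
    return ghi / extraterrestrial_horizontal


def normalized_clearness_index(kt, zenith_deg):
    """Ktp, assuming the Perez et al. (1990) zenith-independent form.

    The denominator is close to 1 at an air mass of one, so Ktp is
    approximately equal to Kt when the sun is overhead.
    """
    air_mass = relative_air_mass(zenith_deg)
    return kt / (1.031 * math.exp(-1.4 / (0.9 + 9.4 / air_mass)) + 0.1)


# Illustrative values only (W/m^2), not taken from the QEERI dataset.
kt = clearness_index(ghi=600.0, extraterrestrial_horizontal=1000.0)
ktp_overhead = normalized_clearness_index(kt, zenith_deg=0.0)
ktp_low_sun = normalized_clearness_index(kt, zenith_deg=60.0)
```

For the same Kt, a lower sun (larger zenith angle) yields a larger Ktp, which is the zenith dependence the normalization is meant to remove.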
Novel de-trending approach based on data mining
• Use clustering to detect latent classes in solar irradiance time series data, and classification to evaluate the detected classes
– Use expectation–maximization to cluster the QEERI 2014 dataset of 5-minute averages of normalized clearness index
– Train & evaluate a Bayesian classifier with the clustered dataset
Month  Day  Hour  Min  KTp1  …  KTp12
…      …    …     …    …     …  …
1      1    6     45   0.86  …  0.74
1      1    6     50   0.84  …  0.72
…      …    …     …    …     …  …
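The record layout in the table (a timestamp followed by KTp1 … KTp12) can be produced by sliding a 12-value window over the 5-minute series, one step at a time. The sketch below assumes the timestamp labels the first value of each window; the slides do not say whether it marks the start or the end. The function name and the sample values are hypothetical.

```python
def build_window_records(timestamps, ktp, window=12):
    """Turn a Ktp series into overlapping records of `window` consecutive values.

    timestamps: list of (month, day, hour, minute) tuples, one per Ktp value.
    Returns tuples (month, day, hour, minute, KTp1, ..., KTp<window>).
    Assumption: each record carries the timestamp of its first value.
    """
    records = []
    for start in range(len(ktp) - window + 1):
        month, day, hour, minute = timestamps[start]
        records.append((month, day, hour, minute, *ktp[start:start + window]))
    return records


# Fourteen made-up 5-minute values starting at 06:45 on 1 January.
timestamps = [(1, 1, 6 + (45 + 5 * i) // 60, (45 + 5 * i) % 60) for i in range(14)]
ktp = [0.86, 0.84, 0.85, 0.83, 0.82, 0.80, 0.81,
       0.79, 0.78, 0.77, 0.76, 0.74, 0.73, 0.72]
records = build_window_records(timestamps, ktp)
```

Each record then serves as one input vector for clustering and, later, for the classifier.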
A Data-mining approach to de-trending
[Figure: workflow]
• Full year solar irradiance dataset → Clustering → Cluster 1, Cluster 2, Cluster 3, Cluster 4
• 75% of cluster data: train classifier that assigns each record to its cluster (BN classification)
• 25% of cluster data: apply classifier
• Evaluation: 96% F1

$F_1 = 2 \cdot \frac{\mathit{precision} \cdot \mathit{recall}}{\mathit{precision} + \mathit{recall}}$

$\mathit{precision} = \frac{TPs}{TPs + FPs}$

$\mathit{recall} = \frac{TPs}{TPs + FNs}$
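The precision, recall and F1 definitions used to evaluate the classifier can be computed per cluster as in this minimal sketch. The function name and the toy labels are illustrative; how the per-cluster scores were aggregated into the reported 96% is not stated in the slides.

```python
def precision_recall_f1(true_labels, predicted_labels, target):
    """Precision, recall and F1 for one cluster label (`target`)."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy held-out labels (made up): true cluster vs. cluster assigned by the classifier.
true_clusters = [1, 1, 2, 2, 3]
predicted_clusters = [1, 2, 2, 2, 3]
precision, recall, f1 = precision_recall_f1(true_clusters, predicted_clusters, target=2)
```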
Forecasting with de-trending
• Train and evaluate autoregressive (AR(11), AR(3)), artificial neural network (ANN), NARX and RNN models that provide 12-step-ahead predictions at 5-minute intervals
– Use two types of de-trended datasets for training
   • 4 seasons
   • 6 clustering methods: 4 clusters for all methods, 1 to 10 clusters for EM
– Train & evaluate models with the full-year dataset to verify the effects of de-trending
– Use persistence as baseline for model comparison
– Use several evaluation metrics; only rRMSE shown here
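The metrics named in the result tables (MBE, RMSE, MAE and their relative forms) can be sketched as below. Two conventions are assumed because the slides do not state them: errors are taken as predicted minus observed, and the relative metrics are expressed as a percentage of the mean observed value.

```python
import math


def error_metrics(observed, predicted):
    """MBE, RMSE, MAE and their relative (%) versions.

    Assumptions: error = predicted - observed; relative metrics are
    normalized by the mean of the observations.
    """
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mean_observed = sum(observed) / n
    mbe = sum(errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    return {
        "MBE": mbe,
        "RMSE": rmse,
        "MAE": mae,
        "rMBE": 100.0 * mbe / mean_observed,
        "rRMSE": 100.0 * rmse / mean_observed,
        "rMAE": 100.0 * mae / mean_observed,
    }


# Made-up Ktp values for illustration.
observed = [0.5, 0.5, 1.0, 1.0]
predicted = [0.6, 0.4, 1.1, 0.9]
metrics = error_metrics(observed, predicted)
```

A forecast that over- and under-predicts symmetrically, as in this toy case, has a near-zero MBE but a non-zero RMSE, which is why the bias and spread columns are reported separately.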
Forecasting results for the whole year
Errors
Model    MBE    RMSE   MAE    rMBE    rRMSE   rMAE
NARX     0.00   0.06   0.03   -0.13    8.56    4.07
RNN      0.00   0.08   0.04   -0.01   12.70    6.60
ANN5     0.00   0.08   0.04   -0.05   12.72    6.60
ANN4     0.00   0.08   0.04   -0.02   12.78    6.64
ANN3     0.00   0.09   0.04    0.07   12.84    6.71
ANN2     0.00   0.09   0.05   -0.06   13.26    7.15
ANN1     0.00   0.10   0.06   -0.07   14.57    8.47
AR(11)  -0.01   0.12   0.08   -1.47   18.69   11.46
AR(3)   -0.02   0.13   0.08   -2.81   19.24   11.79
PER      0.00   0.13   0.08    0.00   19.63   11.38
Forecasting results with de-seasoning
Errors
Model    MBE    RMSE   MAE    rMBE    rRMSE   rMAE
NARX     0.00   0.06   0.03   -0.11    9.80    4.87
RNN      0.00   0.09   0.05   -0.18   13.64    7.92
ANN5     0.00   0.10   0.06    0.15   14.86    8.75
ANN4     0.00   0.09   0.05   -0.15   13.87    8.05
ANN3     0.00   0.10   0.06   -0.17   14.43    8.39
ANN2     0.02   0.10   0.06   -0.11   14.27    8.37
ANN1    -0.01   0.12   0.07   -1.40   17.00    9.72
AR(11)  -0.01   0.13   0.08   -1.62   19.16   11.95
AR(3)   -0.02   0.13   0.09   -3.37   19.97   12.63
PER      0.00   0.13   0.08    0.00   19.63   11.38
Forecasting results with all clusters. 4 groups
Average rRMSE (%)
Model   KMeans   EM      VQ      MTree   CascadeKmeans   XMeans
NARX    11.30    10.21   10.68   10.49   11.32           11.62
RNN     12.90    12.87   12.81   12.89   12.82           12.69
ANN5    12.72    12.78   12.85   12.94   12.78           12.79
ANN4    12.91    12.73   12.74   12.84   12.67           12.93
ANN3    12.90    12.72   12.92   12.88   12.94           12.69
ANN2    13.05    13.02   13.00   13.04   12.90           12.78
ANN1    13.84    13.71   13.52   13.84   13.39           13.41
PER     19.63
Forecasting results with EM cluster and NARX
Errors for EM cluster and NARX
             MBE    RMSE   MAE    rMBE    rRMSE   rMAE
Whole year   0.00   0.06   0.03   -0.13    8.56    4.07
Cluster 2   -0.02   0.13   0.08   -3.72   19.00   12.60
Cluster 3    0.00   0.07   0.03    0.15    9.94    4.79
Cluster 4    0.00   0.07   0.03   -0.06   10.40    5.15
Cluster 6    0.00   0.07   0.03    0.00   10.63    5.14
Cluster 7    0.00   0.07   0.04    0.07   11.25    5.46
Cluster 8    0.00   0.08   0.04    0.08   11.41    5.61
Cluster 9    0.00   0.08   0.04    0.07   11.34    5.61
Cluster 10   0.00   0.08   0.04    0.08   12.11    6.12
PER          0.00   0.13   0.08    0.00   19.63   11.38
Forecasting results with EM cluster and NARX
[Figure: % RRMSD (y-axis, 0 to 20) versus number of clusters (x-axis, 1 to 10)]
Forecasting results with clustering de-trending
• Overall, NARX performs significantly better than the other models with the whole dataset, seasons and clusters
• Non-stationary data is concentrated in a single cluster (3)
• The AR model is strongly affected by time series discontinuity (clusters 2-4)

Average rRMSE
              Whole year   Cluster1   Cluster2   Cluster3   Cluster4
NARX          8.56%        3.80%      11.32%     17.04%     4.83%
AR(11)        18.69%       2.42%      47.95%     95.44%     36.25%
Persistence   19.63%

Average standard deviation
By horizon       2.04%   3.62%   18.38%   1.63%
By time series   0.77%   2.10%   10.01%   0.76%
Season-based vs. Clustering de-trending
• Clustering de-trending helps separate the clusters with higher complexity
– The average rRMSE across cluster 1-2 results for the NARX model is lower than the average rRMSE across season and whole year results
• The concentration of variability in the same cluster (3) may help find solutions for further performance improvements

Solar forecasting error with season vs. clustering de-trending (NARX)
All Year   Season1   Season2   Season3   Season4   Avg. S1-S4
8.56%      11.3%     9.12%     7.96%     10.8%     9.80%

Cluster1   Cluster2   Cluster3   Avg. C1-C3
4.44%      8.10%      16.08%     9.94%
Conclusions
• Solar forecasting is needed to manage PV integration
• Statistical and AI approaches can be useful, but no single model can provide the best performance for all inputs
• Clustering de-trending provides an optimal “divide and conquer” technique to improve solar forecasts, but other instruments like sky cameras need to be used to improve the predictions for the clusters with higher complexity
– Use cluster de-trending as a diagnostic to identify data partitions for which “more complex” modeling techniques are needed
Next steps
1. Use the predictions of the clusters with a classifier