Data Analytics and Machine Learning Approaches for Utility Companies
Modern Engineering and Technology Seminar 2018
Yannan Sun Data Scientist [email protected]
Outline
• Introduction
• Statistical forecasting (Building energy use)
• Physics-based models (Parameter estimation for distribution systems)
• Statistical classification/anomaly detection (Building event detection)
• Oncor use cases using machine learning
About Me
• BS in Math at University of Sci. and Tech. of China
• MS in Statistics, PhD in Math at WSU
• Scientist/Senior Scientist at Pacific NW Natl. Lab
• Data Scientist at Oncor Electric Delivery
Oncor – By the Numbers
3.3M AMS Meters
121k T&D Circuit Line Miles
500k Switch Points
250k SCADA Points
1.5B SCADA transactions/month
60 TB New Data Records/month
250 TB Current EDW
Data storage
• Open energy delivery platform
• Enabling competitive retail, generation, and open market in-home services
• Agnostic to generation technologies or locations
• Focus on reliability and information services
• Need advanced data analytical methods
Oncor’s Analytics Platform
Data Size and Other Problems
Building Energy-Use Forecasting using Regression Trees
Building Energy-Use Forecasting
• Volume – For one building the data is small, but if we want to mine the entire history of many buildings, the data size grows rapidly.
• Variety – numerical, categorical, ordinal, all kinds might be available from sensors.
• Velocity – hourly data with day-ahead predictions; potentially not an issue.
• Veracity – weather reports, sensor malfunctions, lack of sensors
Building Energy-Use Forecasting
[Figure: forecasting model diagram; labels: Inputs, Output, Holiday]
Energy-Use Forecasting – Methods
• Several “off-the-shelf” methods work well for prediction.
– Regression tree, Gaussian processes, support vector machine (SVM), (deep) neural network
• Regression trees are fast to train and test, so they are easy to use with a “rolling” window.
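[Figure: example regression tree. The root splits on a (a > 5, a < 5) and on b (b > 2, b < 2); each of the three leaves holds its own linear model: y = c11*a + c12*b, y = c21*a + c22*b, y = c31*a + c32*b]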
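The tree on this slide routes each sample through threshold tests on the inputs and then applies a leaf-specific linear model. A minimal sketch of that prediction step is shown here. The branch layout (split on a first, then on b only in the a > 5 branch) and every coefficient value are assumptions chosen for illustration, since the slide gives only the symbols.

```python
from dataclasses import dataclass


@dataclass
class Leaf:
    """Leaf holding a linear model y = ca * a + cb * b."""
    ca: float
    cb: float

    def predict(self, a: float, b: float) -> float:
        return self.ca * a + self.cb * b


# Hypothetical stand-ins for c11..c32; the slide gives symbols only.
LEAF_A_HIGH_B_HIGH = Leaf(ca=1.0, cb=2.0)  # reached when a > 5 and b > 2
LEAF_A_HIGH_B_LOW = Leaf(ca=0.5, cb=1.0)   # reached when a > 5 and b <= 2
LEAF_A_LOW = Leaf(ca=2.0, cb=0.0)          # reached when a <= 5


def predict(a: float, b: float) -> float:
    """Route (a, b) through the assumed splits, then apply the leaf model."""
    if a > 5:
        leaf = LEAF_A_HIGH_B_HIGH if b > 2 else LEAF_A_HIGH_B_LOW
    else:
        leaf = LEAF_A_LOW
    return leaf.predict(a, b)
```

Prediction costs only a few comparisons plus one small dot product, which is consistent with the claim that such trees are fast to evaluate.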
1-Day Ahead Forecasts – Regression Tree
• One month of initial training data
• RMSE is about 12
• The model works well for all prediction windows, but the real dependency is the weather forecast.
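The rolling-window procedure can be sketched as follows: refit a small tree on the most recent window, forecast the next day, then slide forward. This is an illustrative sketch, not the model used for the results on this slide. It uses constant-valued leaves for brevity, and the features (hour of day, day of week), the window length, and the noise-free synthetic load are all assumptions.

```python
import math


def sse(ys):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((v - m) ** 2 for v in ys)


def fit_tree(X, y, depth=3, min_leaf=4):
    """Fit a small regression tree with constant leaves (CART-style splits)."""
    best = None
    if depth > 0 and len(y) >= 2 * min_leaf:
        for j in range(len(X[0])):
            for t in sorted({row[j] for row in X}):
                left = [i for i, row in enumerate(X) if row[j] <= t]
                right = [i for i, row in enumerate(X) if row[j] > t]
                if len(left) < min_leaf or len(right) < min_leaf:
                    continue
                cost = sse([y[i] for i in left]) + sse([y[i] for i in right])
                if best is None or cost < best[0]:
                    best = (cost, j, t, left, right)
    if best is None:
        return sum(y) / len(y)  # leaf: mean of the training targets
    _, j, t, left, right = best
    return (j, t,
            fit_tree([X[i] for i in left], [y[i] for i in left], depth - 1, min_leaf),
            fit_tree([X[i] for i in right], [y[i] for i in right], depth - 1, min_leaf))


def predict_tree(node, row):
    """Descend from the root until a leaf value (a number) is reached."""
    while isinstance(node, tuple):
        j, t, left, right = node
        node = left if row[j] <= t else right
    return node


def rolling_forecast(X, y, window, horizon):
    """Refit on the latest `window` samples, then predict the next `horizon`."""
    preds = []
    for start in range(window, len(y), horizon):
        tree = fit_tree(X[start - window:start], y[start - window:start])
        preds += [predict_tree(tree, row) for row in X[start:start + horizon]]
    return preds


def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Synthetic hourly data for 14 days: features are [hour of day, day of week].
X = [[t % 24, (t // 24) % 7] for t in range(14 * 24)]
y = [50.0 if 8 <= row[0] < 18 else 20.0 for row in X]  # assumed load profile

preds = rolling_forecast(X, y, window=7 * 24, horizon=24)
error = rmse(y[7 * 24:], preds)
```

Because each refit touches only the window, the cost per forecast stays bounded as history grows. On real data the error would also reflect the quality of the weather forecast used as input.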
Normalized Error of Forecasts – 1 Day
[Figure annotations: holidays; training set too small]
Parameter Estimation for Distribution Systems
Parameter Estimation for Distribution Systems
• Volume – power injection, voltage, power flow, AMI measurements
• Variety – numerical data (multi-phase, multi-measurement)
• Velocity – SCADA data, 5-minute data generated by GridLAB-D (a power distribution system simulation tool)
• Veracity – bad data, missing data, parameter error, measurement noise
Parameter Estimation using the Kalman Filter
• Keeps track of the estimated state of the system and the variance or uncertainty of the estimate
• Many applications: dynamic positioning, satellite navigation systems, weather forecasting, power system state estimation, etc.
• State-vector augmentation with a Kalman filter
$$x_k = f(x_{k-1}, u_{k-1}, w_{k-1})$$
$$z_k = h(x_k, v_k)$$
where $x := (x_s, x_p)$ with covariance $P := \begin{bmatrix} P_s & 0 \\ 0 & P_p \end{bmatrix}$.
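To make the augmentation idea concrete, here is a minimal sketch in which an unknown parameter x_p is appended to the state x_s and both are estimated by one filter. It covers only a linear special case with identity dynamics; the general nonlinear f and h would need an extended or unscented variant. The measurement model z = x_s + x_p * u + v, the noise levels, and the true values are assumptions made for illustration.

```python
import random


def kf_step(x, P, z, u, q, r):
    """One predict/update cycle for the augmented state x = [x_s, x_p].

    Assumed model (illustrative only): identity dynamics with process
    noise variances q = (q_s, q_p), and a scalar measurement
    z = x_s + x_p * u + v with noise variance r.
    """
    # Predict: identity dynamics, so only the covariance grows.
    P = [[P[0][0] + q[0], P[0][1]],
         [P[1][0], P[1][1] + q[1]]]
    h = (1.0, u)                                  # measurement row H = [1, u]
    ph = [P[0][0] * h[0] + P[0][1] * h[1],        # P @ H^T
          P[1][0] * h[0] + P[1][1] * h[1]]
    s = h[0] * ph[0] + h[1] * ph[1] + r           # innovation variance
    k = [ph[0] / s, ph[1] / s]                    # Kalman gain
    resid = z - (h[0] * x[0] + h[1] * x[1])       # measurement residual
    x = [x[0] + k[0] * resid, x[1] + k[1] * resid]
    hp = [h[0] * P[0][0] + h[1] * P[1][0],        # H @ P
          h[0] * P[0][1] + h[1] * P[1][1]]
    P = [[P[i][j] - k[i] * hp[j] for j in range(2)] for i in range(2)]
    return x, P


rng = random.Random(0)
true_state, true_param = 1.0, 0.3                 # hypothetical values
x = [0.0, 0.0]                                    # initial guess for [x_s, x_p]
P = [[1.0, 0.0], [0.0, 1.0]]                      # block-diagonal: diag(P_s, P_p)
for _ in range(200):
    u = rng.uniform(-1.0, 1.0)                    # known input at this time step
    z = true_state + true_param * u + rng.gauss(0.0, 0.01)
    x, P = kf_step(x, P, z, u, q=(1e-8, 1e-8), r=0.01 ** 2)
```

The diagonal of P tracks the remaining uncertainty in the state and the parameter separately, which is what lets the filter report a confidence for each parameter estimate.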
Parameter Estimation using the KF
Estimated parameter error with no measurement noise and single snapshot using residual analysis (rN) and the KF approach
Estimated parameter error with 1% measurement noise and single snapshot using residual analysis (rN) and the KF approach
Parameter Estimation using the KF
Estimated parameter error with 1% measurement noise and single time steps using residual analysis (rN) and the KF approach
Estimated parameter error with 1% measurement noise and 12 combined time steps using residual analysis (rN) and the KF approach
Parameter Estimation using the KF
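| $\sigma(z_k)$ | $t_c = 1$, RA | $t_c = 1$, KF | $t_c = 12$, RA | $t_c = 12$, KF | $t_c = 24$, RA | $t_c = 24$, KF |
|---|---|---|---|---|---|---|
| 0.00% | 0.0190 | 0.0178 | 0.0097 | 0.0094 | 0.0188 | 0.0188 |
| 0.50% | 0.0174 | 0.0150 | 0.0091 | 0.0061 | 0.0180 | 0.0154 |
| 1.00% | 0.0119 | 0.0180 | 0.0167 | 0.0190 | 0.0163 | 0.0192 |
| 1.50% | 0.0026 | 0.0180 | 0.0048 | 0.0097 | 0.0138 | 0.0192 |

Mean of PE errors using RA and KF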
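| $\sigma(z_k)$ | $t_c = 1$, RA | $t_c = 1$, KF | $t_c = 12$, RA | $t_c = 12$, KF | $t_c = 24$, RA | $t_c = 24$, KF |
|---|---|---|---|---|---|---|
| 0.00% | 0.0022 | 0.0043 | 0.0168 | 0.0174 | 0.0020 | 0.0041 |
| 0.50% | 0.0626 | 0.0054 | 0.0259 | 0.0174 | 0.0132 | 0.0060 |
| 1.00% | 0.1262 | 0.0042 | 0.0373 | 0.0039 | 0.0261 | 0.0035 |
| 1.50% | 0.1922 | 0.0042 | 0.0595 | 0.0172 | 0.0393 | 0.0034 |

Standard deviation of PE errors using RA and KF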
Building Event Detection using Bayesian One-Class SVM
Building Event Detection
• Volume – for one building the data is small, but if we want to mine the entire history of many buildings, the data size grows rapidly.
• Variety – numerical, categorical, ordinal, all kinds might be available from sensors.
• Velocity – hourly data generated by EnergyPlus (a whole building energy simulation tool)
• Veracity – weather reports, sensor malfunctions, lack of sensors
Event/Anomaly Detection
• To predict binary outcomes (true/false) we could use supervised learning; such a model is called a binary classifier.
• The difficulty with this approach for faults is that it is very time-consuming to annotate the training data.
• Incorrect labels are also a problem. What is a fault? Are there unseen/unknown faults?
• To get around the labeling problem we recast the task as anomaly detection, which is an unsupervised classification problem.
Bayesian One-Class SVM
• The one-class SVM is used for novelty/outlier detection
• Construct the population density of a dataset and decide whether new points are
– Normal: in a high-population (high-density) region
– Abnormal: in a low-population (low-density) region
• The model is trained using past normal days. Predictions are made for the next week.
– Unsupervised
– Dynamic (time evolving)
Note that the frontier can be any level set, e.g. the light blue hourglass. Image from http://scikit-learn.org
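The idea of flagging points that fall in low-density regions can be sketched without an SVM solver. The example uses a plain kernel density score with a quantile threshold as a simple stand-in; it is not the Bayesian one-class SVM described on this slide. The two-dimensional features (for example a daily peak and a daily mean load), the value of gamma, and the 5% quantile are all assumptions.

```python
import math


def rbf(x, y, gamma):
    """RBF kernel exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def density_score(point, train, gamma):
    """Mean RBF similarity to the training set (an unnormalised density)."""
    return sum(rbf(point, t, gamma) for t in train) / len(train)


def fit_threshold(train, gamma, quantile=0.05):
    """Score below which roughly `quantile` of the training points fall."""
    scores = sorted(density_score(p, train, gamma) for p in train)
    return scores[int(quantile * len(scores))]


def is_abnormal(point, train, gamma, threshold):
    """Flag a point whose density score lies below the chosen level set."""
    return density_score(point, train, gamma) < threshold


# Hypothetical "normal day" features on a small grid of values.
normal_days = [(20 + 0.1 * i, 50 + 0.2 * j) for i in range(5) for j in range(5)]
gamma = 1.0
threshold = fit_threshold(normal_days, gamma)

typical = is_abnormal((20.2, 50.4), normal_days, gamma, threshold)
unusual = is_abnormal((30.0, 80.0), normal_days, gamma, threshold)
```

Refitting the threshold on a sliding set of recent normal days would give the time-evolving behaviour mentioned in the bullets. The choice of quantile sets which level set acts as the frontier.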
Building Event Detection – Similarity Metrics
[Figure panels: radial basis function (RBF) similarity; Euclidean similarity; RBF with wrong parameters (two panels)]
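The effect of a poorly chosen RBF parameter can be illustrated numerically. In the sketch below, the gamma values and the test points are arbitrary. The form 1 / (1 + distance) for Euclidean similarity is an assumption, because the slide does not define that metric.

```python
import math


def sq_dist(x, y):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def rbf_similarity(x, y, gamma):
    """RBF similarity exp(-gamma * ||x - y||^2), in (0, 1]."""
    return math.exp(-gamma * sq_dist(x, y))


def euclidean_similarity(x, y):
    """Assumed form 1 / (1 + distance); the slide does not define it."""
    return 1.0 / (1.0 + math.sqrt(sq_dist(x, y)))


ref, near, far = (0.0, 0.0), (1.0, 1.0), (4.0, 5.0)
for gamma in (1e-4, 0.1, 100.0):
    # Print gamma, similarity to the near point, similarity to the far point.
    print(gamma, rbf_similarity(ref, near, gamma), rbf_similarity(ref, far, gamma))
```

With a very small gamma both points look almost identical to the reference, and with a very large gamma both look completely dissimilar. Only an intermediate value separates the near point from the far one, so the parameter has to match the scale of the data.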
Building Event Detection – Data
Classification Results
Oncor Use Cases
Meter to Transformer Connectivity Error Detection
Transformer Load Forecast & Over Load Prediction