Data Analytics and Machine Learning Approaches for Utility Companies Modern Engineering and Technology Seminar 2018 Yannan Sun Data Scientist [email protected]


Page 1: Data Analytics and Machine Learning Approaches for Utility Companies · 2018-10-19

Data Analytics and Machine Learning Approaches for Utility Companies

Modern Engineering and Technology Seminar 2018

Yannan Sun Data Scientist [email protected]

Page 2

Outline

• Introduction

• Statistical forecasting (Building energy use)

• Physics-based models (Parameter estimation for distribution systems)

• Statistical classification/anomaly detection (Building event detection)

• Oncor use cases using machine learning

Page 3

About Me

• BS in Math at University of Science and Technology of China

• MS in Statistics, PhD in Math at WSU

• Scientist/Senior Scientist at Pacific Northwest National Laboratory

• Data Scientist at Oncor Electric Delivery

Page 4

Oncor – By the Numbers

3.3M AMS Meters

121k T&D Circuit Line Miles

500k Switch Points

250k SCADA Points

1.5B SCADA transactions/month

60 TB New Data Records/month

250 TB current EDW data storage

• Open energy delivery platform

• Enabling competitive retail, generation, and open market in-home services

• Agnostic to generation technologies or locations

• Focus on reliability and information services

• Need advanced data analytical methods

Page 5

Oncor’s Analytics Platform

Page 6

Data Size and Other Problems

Page 7

Building Energy-Use Forecasting using Regression Trees

Page 8

Building Energy-Use Forecasting

• Volume – For one building the data is small, but if we want to mine the entire history of many buildings, the data size grows rapidly.

• Variety – numerical, categorical, and ordinal data may all be available from sensors.

• Velocity – hourly data with day-ahead predictions; likely not an issue.

• Veracity – weather reports, sensor malfunctions, lack of sensors

Page 9

Building Energy-Use Forecasting

[Figure: forecasting model diagram with labels "Inputs", "Output", and "Holiday"]

Page 10

Energy-Use Forecasting – Methods

• Several “off-the-shelf” methods work well for prediction: regression trees, Gaussian processes, support vector machines (SVM), and (deep) neural networks.

• Regression trees are fast to train and test, so they are easy to use with a “rolling” window.

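[Figure: example regression tree. From the root, the branches split on a (a > 5 vs. a < 5) and on b (b > 2 vs. b < 2). Each of the three leaves holds a linear model: y = c11*a + c12*b, y = c21*a + c22*b, y = c31*a + c32*b]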
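The tree on this slide can be read as a small piecewise-linear model: internal nodes compare a or b with a threshold, and each leaf applies its own linear formula. The following is a minimal Python sketch of that reading, not the presenter's implementation. The coefficient values are hypothetical, and so is the assumption that the split on b sits under the a > 5 branch; the slide specifies neither.

```python
# Hypothetical leaf coefficients (c_i1, c_i2); the slide gives no values.
LEAF_COEFFS = {
    "a>5,b>2": (1.0, 0.5),     # c11, c12
    "a>5,b<=2": (0.75, 0.25),  # c21, c22
    "a<=5": (0.5, 0.125),      # c31, c32
}


def model_tree_predict(a, b, coeffs=LEAF_COEFFS):
    """Route (a, b) to a leaf, then apply that leaf's linear model."""
    # Assumed layout: the root splits on a; the a > 5 branch splits again on b.
    if a > 5:
        leaf = "a>5,b>2" if b > 2 else "a>5,b<=2"
    else:
        leaf = "a<=5"
    c1, c2 = coeffs[leaf]
    return c1 * a + c2 * b
```

Because prediction is a handful of comparisons followed by one linear formula, evaluating such a tree is cheap, which fits the slide's point about speed.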

Page 11

1-Day Ahead Forecasts – Regression Tree

• One month of initial training data

• RMSE is about 12

• The model works well for all prediction windows, but the real dependency is the weather forecast.
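A rolling-window evaluation with an RMSE score can be sketched as follows. This is an illustration under stated assumptions, not the setup used in the talk: the window is taken to be a fixed 30 days of hourly data that slides forward one day at a time, and the function names `rolling_day_ahead_splits` and `rmse` are made up for this example.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rolling_day_ahead_splits(n_hours, train_hours=30 * 24, horizon=24):
    """Yield (train_range, test_range) index pairs over hourly data.

    Assumption: a fixed-length training window slides forward by one
    forecast horizon (24 hours) at each step.
    """
    start = 0
    while start + train_hours + horizon <= n_hours:
        train = range(start, start + train_hours)
        test = range(start + train_hours, start + train_hours + horizon)
        yield train, test
        start += horizon
```

For each pair, a model would be refit on the training indices and scored with `rmse` on the following 24 hours. Refitting at every step is only practical because the model is cheap to train.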

Page 12

Normalized Error of Forecasts – 1 Day

[Figure: normalized error of the 1-day forecasts, annotated "Holidays" and "Training set too small"]

Page 13

Parameter Estimation for Distribution Systems

Page 14

Parameter Estimation for Distribution Systems

• Volume – power injection, voltage, power flow, AMI measurements

• Variety – numerical data (multi-phase, multi-measurement)

• Velocity – SCADA data, 5-minute data generated by GridLAB-D (a power distribution system simulation tool)

• Veracity – bad data, missing data, parameter error, measurement noise

Page 15

Parameter Estimation using the Kalman Filter

• Keeps track of the estimated state of the system and the variance or uncertainty of the estimate

• Many applications: dynamic positioning, satellite navigation systems, weather forecasting, power system state estimation, etc.

• State-vector augmentation with a Kalman filter

$$x_k = f(x_{k-1}, u_{k-1}, w_{k-1})$$

$$z_k = h(x_k, v_k)$$

where $x := [x_s, x_p]$ with covariance $P := \begin{bmatrix} P_s & 0 \\ 0 & P_p \end{bmatrix}$.
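To show how a Kalman filter tracks both an estimate and its variance, here is a deliberately simplified scalar sketch in Python. It covers only a single unknown parameter (the role of the x_p block in the augmented state) and leaves out the x_s block. It assumes a hypothetical linear measurement model z_k = u_k * p + v_k with known inputs u_k and a random-walk model for p. None of these modelling choices come from the slides, and a distribution-system application would need the full nonlinear f and h.

```python
def kalman_parameter_estimate(inputs, measurements, p0=0.0, var0=1.0,
                              process_var=0.0, meas_var=1e-4):
    """Recursively estimate a scalar parameter p from z_k = u_k * p + v_k.

    Returns a list of (estimate, variance) pairs, one per measurement.
    All default values are illustrative assumptions.
    """
    est, var = p0, var0
    history = []
    for u, z in zip(inputs, measurements):
        # Predict: random-walk parameter model, p_k = p_{k-1} + w_{k-1}.
        var = var + process_var
        # Update: weigh the innovation by the Kalman gain.
        innovation = z - u * est
        innovation_var = u * var * u + meas_var
        gain = var * u / innovation_var
        est = est + gain * innovation
        var = (1.0 - gain * u) * var
        history.append((est, var))
    return history
```

For example, feeding noise-free measurements generated with p = 0.5 drives the estimate toward 0.5 while the tracked variance shrinks. That shrinking variance is the uncertainty bookkeeping the first bullet on this slide refers to.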

Page 16

Parameter Estimation using the KF

Estimated parameter error with no measurement noise and single snapshot using residual analysis (rN) and the KF approach

Estimated parameter error with 1% measurement noise and single snapshot using residual analysis (rN) and the KF approach

Page 17

Parameter Estimation using the KF

Estimated parameter error with 1% measurement noise and a single time step using residual analysis (rN) and the KF approach

Estimated parameter error with 1% measurement noise and 12 combined time steps using residual analysis (rN) and the KF approach

Page 18

Parameter Estimation using the KF

σ(z_k)    t_c = 1            t_c = 12           t_c = 24
          RA       KF        RA       KF        RA       KF
0.00%     0.0190   0.0178    0.0097   0.0094    0.0188   0.0188
0.50%     0.0174   0.0150    0.0091   0.0061    0.0180   0.0154
1.00%     0.0119   0.0180    0.0167   0.0190    0.0163   0.0192
1.50%     0.0026   0.0180    0.0048   0.0097    0.0138   0.0192

Mean of PE errors using RA and KF

σ(z_k)    t_c = 1            t_c = 12           t_c = 24
          RA       KF        RA       KF        RA       KF
0.00%     0.0022   0.0043    0.0168   0.0174    0.0020   0.0041
0.50%     0.0626   0.0054    0.0259   0.0174    0.0132   0.0060
1.00%     0.1262   0.0042    0.0373   0.0039    0.0261   0.0035
1.50%     0.1922   0.0042    0.0595   0.0172    0.0393   0.0034

Standard deviation of PE errors using RA and KF

Page 19

Building Event Detection using Bayesian One-Class SVM

Page 20

Building Event Detection

• Volume – for one building the data is small, but if we want to mine the entire history of many buildings, the data size grows rapidly.

• Variety – numerical, categorical, and ordinal data may all be available from sensors.

• Velocity – hourly data generated by EnergyPlus (a whole-building energy simulation tool)

• Veracity – weather reports, sensor malfunctions, lack of sensors

Page 21

Event/Anomaly Detection

• To predict binary outcomes (true/false), we could use supervised learning with a binary classifier.

• The difficulty with this approach for faults is that it is very time-consuming to annotate the training data.

• Incorrect labels are also a problem. What is a fault? Are there unseen/unknown faults?

• To get around the labeling problem, we recast the problem as anomaly detection, which is an unsupervised classification problem.

Page 22

Bayesian One-Class SVM

• The one-class SVM is used for novelty/outlier detection

• Construct the population density of a dataset and decide whether new points are
  – Normal: in a high-population region
  – Abnormal: in a low-population region

• The model is trained using past normal days. Predictions are made for the next week.
  – Unsupervised
  – Dynamic (time-evolving)

Note that the frontier can be any level set, e.g. the light blue hourglass. Image from http://scikit-learn.org
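The density idea on this slide can be illustrated with a plain kernel density score. This is a simpler stand-in for a one-class SVM, not the Bayesian one-class SVM from the talk. The sketch below scores a new point by its average RBF similarity to past normal points and flags it as abnormal when that score falls below a threshold taken from the training scores. The function names, the `gamma` value, and the 5% quantile are all assumptions made for this example.

```python
import math


def rbf_kernel(x, y, gamma):
    """RBF kernel exp(-gamma * ||x - y||^2) for two equal-length vectors."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-gamma * sq_dist)


def density_score(point, normal_points, gamma=0.5):
    """Average kernel similarity of `point` to the normal training points."""
    total = sum(rbf_kernel(point, p, gamma) for p in normal_points)
    return total / len(normal_points)


def fit_threshold(normal_points, gamma=0.5, quantile=0.05):
    """Pick the score at the given quantile of the training scores."""
    scores = sorted(density_score(p, normal_points, gamma)
                    for p in normal_points)
    return scores[int(quantile * len(scores))]


def is_abnormal(point, normal_points, threshold, gamma=0.5):
    """Flag points that fall in a low-density region."""
    return density_score(point, normal_points, gamma) < threshold
```

Retraining on a moving set of recent normal days would give the time-evolving behavior mentioned in the last bullet. No fault labels are needed at any step, which is the reason for moving to anomaly detection.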

Page 23

Building Event Detection – Similarity Metrics

[Figure panels: radial basis function (RBF) similarity; Euclidean similarity; two panels of RBF similarity with wrong parameters]
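The "wrong parameters" panels suggest that RBF similarity is sensitive to its width parameter. The sketch below uses the standard definition exp(-gamma * d^2), where d is the Euclidean distance, to show why. The slides do not define either similarity, so the formula, the name `gamma`, and the sample profiles are assumptions for illustration.

```python
import math


def euclidean_distance(x, y):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def rbf_similarity(x, y, gamma):
    """RBF similarity in (0, 1]; equals 1 only when x and y coincide."""
    return math.exp(-gamma * euclidean_distance(x, y) ** 2)


# Two hypothetical short load profiles (values are made up).
day_a = [1.0, 2.0, 3.0]
day_b = [1.0, 2.5, 3.5]

reasonable = rbf_similarity(day_a, day_b, gamma=1.0)   # informative value
too_large = rbf_similarity(day_a, day_b, gamma=100.0)  # nearly 0 for any pair
too_small = rbf_similarity(day_a, day_b, gamma=1e-6)   # nearly 1 for any pair
```

When `gamma` is far too large, every pair of days looks dissimilar. When it is far too small, every pair looks identical. In both cases the similarity matrix stops carrying useful structure.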

Page 24

Building Event Detection – Data

Page 25

Classification Results

Page 26

Oncor Use Cases

Page 27

Meter to Transformer Connectivity Error Detection

Page 28

Transformer Load Forecast & Over Load Prediction

Page 29

Data Analytics and Machine Learning Approaches for Utility Companies

Modern Engineering and Technology Seminar 2018

Yannan Sun Data Scientist [email protected]