Spatio-temporal Load Curve Data Cleansing and Imputation via Sparsity and Low Rank
Gonzalo Mateos and Georgios B. Giannakis
Dept. of ECE and Digital Technology Center, University of Minnesota
November 5, 2012
Workshop on Architectures and Models for the Smart Grid
2
Context
Robust imputation of network data
Goal: Given a few rows per agent, perform distributed cleansing and imputation by leveraging the low rank of the nominal data matrix and the sparsity of the outliers.
Network health cartography
Smart metering
Wind farm monitoring
3
Load curve data cleansing
Load curve: electric power consumption recorded periodically
Reliable data: key to realize smart grid vision [Hauser’09]
Uruguay’s aggregate power consumption (MW)
Missing data: faulty meters, communication errors, few PMUs
Outliers: unscheduled maintenance, strikes, sport events [Chen et al’10]
4
Spatio-temporal load profiles
Power measured at bus n, at time t
Spatio-temporal model: measured data = nominal load profiles + outliers
Low-rank nominal load profiles
Sparse outliers across buses and time
5
Principal Component Pursuit
Data model: X has low rank, the outlier matrix is sparse
Goal: Given Y, recover X and the outlier matrix
Principal component pursuit [Chandrasekaran et al’11], [Candes et al’11]: […]
Missing data: set […]
Sampling operator: […]
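For reference, the textbook principal component pursuit program from the works cited on this slide can be written as below. The notation is an assumption, since the slide's own formulas did not survive: O denotes the sparse outlier matrix, λ > 0 a trade-off weight, Ω the set of observed entries, and P_Ω the sampling operator that zeroes every entry outside Ω. The cost actually used in the talk may differ, for example by adding a noise term.

```latex
% Assumed notation: O (outliers), \lambda (weight), \Omega (observed entries).
\begin{align}
  \text{full data:}\quad
    & \min_{X,\,O}\ \|X\|_* + \lambda \|O\|_1
      \quad \text{s.t.}\quad Y = X + O, \\
  \text{missing data:}\quad
    & \min_{X,\,O}\ \|X\|_* + \lambda \|O\|_1
      \quad \text{s.t.}\quad \mathcal{P}_\Omega(Y) = \mathcal{P}_\Omega(X + O).
\end{align}
```

Here ‖X‖_* is the nuclear norm (sum of singular values), which promotes low rank, and ‖O‖_1 is the entrywise ℓ1 norm, which promotes sparsity.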
6
Distributed processing paradigms
Fusion Center (FC), Incremental, In-network
Limitations of FC-based architectures: lack of robustness (isolated point of failure, non-ideal links); high Tx power (as geographical area grows); less suitable for tracking applications
Limitations of incremental processing: non-robust to node failures; (re-)routing? Hamiltonian routes are NP-hard to establish
7
Problem statement
Network of smart meters: undirected, connected graph
Goal: Given […] per node and single-hop exchanges, find
(P1) […]
Challenges: nuclear norm is not separable; global optimization variable
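The first challenge can be checked numerically. The following is a minimal sketch on a hypothetical 2 x 2 matrix whose rows are held by two different agents: the nuclear norm of the whole matrix is not the sum of the nuclear norms of the rows, whereas the squared Frobenius norm does split across rows. The matrix and variable names are illustrative only.

```python
import numpy as np

def nuclear_norm(M):
    """Sum of the singular values of M."""
    return np.linalg.svd(M, compute_uv=False).sum()

# Hypothetical toy matrix; agent n is assumed to hold row n.
X = np.ones((2, 2))
rows = [X[[n], :] for n in range(X.shape[0])]

# Nuclear norm: the whole matrix has rank one with singular value 2,
# while each row alone has nuclear norm sqrt(2).
whole_nuc = nuclear_norm(X)
sum_rows_nuc = sum(nuclear_norm(r) for r in rows)

# Squared Frobenius norm: a plain sum of squared entries, so it separates.
whole_fro2 = np.linalg.norm(X, "fro") ** 2
sum_rows_fro2 = sum(np.linalg.norm(r, "fro") ** 2 for r in rows)

print("nuclear:", whole_nuc, "vs sum over rows:", sum_rows_nuc)
print("Frobenius^2:", whole_fro2, "vs sum over rows:", sum_rows_fro2)
```

This is why a per-agent split of the nuclear-norm cost is not immediate, and why a regularizer built from Frobenius norms is attractive for distributed processing.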
8
Separable regularization
Key property; e.g., [Recht et al’11]: […]
New formulation equivalent to (P1)
(P2) […]
Nonconvex; reduces complexity: L × ρ, ρ ≥ rank[X]
Proposition 1. If […] is a stationary point of (P2) and […], then […] is a global optimum of (P1).
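A well-known identity of this kind writes the nuclear norm as the minimum of ½(‖P‖_F² + ‖Q‖_F²) over all factorizations X = PQᵀ. Assuming that this is the key property meant on the slide, the sketch below checks it numerically on a random low-rank matrix. The factor names P and Q, the sizes, and the mixing matrix G are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, r = 8, 12, 3                      # hypothetical sizes and rank
X = rng.standard_normal((N, r)) @ rng.standard_normal((r, T))

U, s, Vt = np.linalg.svd(X, full_matrices=False)
nuc = s.sum()                           # nuclear norm of X

# Balanced factors built from the SVD attain the minimum of the bilinear cost.
P = U[:, :r] * np.sqrt(s[:r])
Q = Vt[:r, :].T * np.sqrt(s[:r])
balanced_cost = 0.5 * (np.linalg.norm(P, "fro") ** 2
                       + np.linalg.norm(Q, "fro") ** 2)

# Another factorization of the same X, obtained with an invertible mixing
# matrix G (upper triangular, diagonal entries equal to 2).
G = np.triu(np.ones((r, r))) + np.eye(r)
P2 = P @ G
Q2 = Q @ np.linalg.inv(G).T
other_cost = 0.5 * (np.linalg.norm(P2, "fro") ** 2
                    + np.linalg.norm(Q2, "fro") ** 2)

print("nuclear norm      :", nuc)
print("balanced factors  :", balanced_cost)
print("other factorization:", other_cost)
```

The Frobenius terms are sums over the rows of P and Q, so the regularizer separates across agents in a way the nuclear norm does not. The price is a nonconvex bilinear cost.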
9
Distributed estimator
Network connectivity ⇒ (P2) equivalent to (P3)
(P3) […]
Consensus with neighboring nodes
Alternating-directions method of multipliers (ADMM) solver
Method [Glowinski-Marrocco’75], [Gabay-Mercier’76]
Learning over networks [Schizas et al’07]
Primal variables per agent n: […]
Message passing: […]
10
Distributed iterations
11
Attractive features
Highly parallelizable with simple recursions: unconstrained QPs per agent; no SVD per iteration [O(Tρ³) complexity]
Low overhead for message exchanges: […] is […] and […] is small; comm. cost independent of network size
Recap: (P1) → (P2) → (P3)
(P1): centralized, convex
(P2): separable regularization, nonconvex
(P3): consensus, nonconvex
Stationary (P3) ⇒ stationary (P2) ⇒ global (P1)
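To illustrate why a factorized cost needs no SVD, here is a minimal centralized sketch. It is plain alternating minimization, not the distributed ADMM iterations of the talk. It assumes a cost of the form ½‖P_Ω(Y − PQᵀ − O)‖_F² + (λ*/2)(‖P‖_F² + ‖Q‖_F²) + λ1‖O‖_1, which is a guess at the lost (P2). The names P, Q, O, M, lam_star, and lam_1 are hypothetical. Every update is either a soft-threshold or a small ρ x ρ ridge regression.

```python
import numpy as np

def soft(Z, tau):
    """Entrywise soft-thresholding, the proximal operator of tau*|.|."""
    return np.sign(Z) * np.maximum(np.abs(Z) - tau, 0.0)

def cost(Y, M, P, Q, O, lam_star, lam_1):
    """Assumed cost: masked LS fit + Frobenius and l1 regularizers."""
    R = M * (Y - P @ Q.T - O)
    return (0.5 * np.sum(R ** 2)
            + 0.5 * lam_star * (np.sum(P ** 2) + np.sum(Q ** 2))
            + lam_1 * np.sum(np.abs(O)))

def ridge_rows(B, F, M, lam_star):
    """Row n solves a ridge regression of the observed part of B[n] on F."""
    rho = F.shape[1]
    out = np.zeros((B.shape[0], rho))
    for n in range(B.shape[0]):
        obs = M[n] > 0
        Fn = F[obs]                                   # regressors at observed entries
        A = Fn.T @ Fn + lam_star * np.eye(rho)        # rho x rho system, no SVD
        out[n] = np.linalg.solve(A, Fn.T @ B[n, obs])
    return out

def alternating_min(Y, M, rho, lam_star, lam_1, iters=30, seed=0):
    rng = np.random.default_rng(seed)
    N, T = Y.shape
    P = rng.standard_normal((N, rho))
    Q = rng.standard_normal((T, rho))
    O = np.zeros((N, T))
    history = [cost(Y, M, P, Q, O, lam_star, lam_1)]
    for _ in range(iters):
        O = M * soft(Y - P @ Q.T, lam_1)              # outliers on observed entries
        P = ridge_rows(Y - O, Q, M, lam_star)         # one small solve per bus
        Q = ridge_rows((Y - O).T, P, M.T, lam_star)   # one small solve per time
        history.append(cost(Y, M, P, Q, O, lam_star, lam_1))
    return P, Q, O, history

# Hypothetical toy data: low rank plus sparse spikes, 80% of entries observed.
rng = np.random.default_rng(1)
N, T = 10, 40
X_true = rng.standard_normal((N, 2)) @ rng.standard_normal((2, T))
O_true = 5.0 * (rng.random((N, T)) < 0.05) * rng.choice([-1.0, 1.0], size=(N, T))
M = (rng.random((N, T)) < 0.8).astype(float)
Y = M * (X_true + O_true)

P, Q, O, history = alternating_min(Y, M, rho=3, lam_star=0.1, lam_1=0.5)
print("cost: first", history[0], "last", history[-1])
```

Each block update minimizes the cost exactly over that block, so the recorded cost never increases. As a nonconvex scheme it is only expected to reach a stationary point, and the regularization weights here are untuned.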
12
Optimality
Proposition 2. If […] converges to […] and […], then:
i) […]
ii) […] is the global optimum of (P1).
ADMM can converge even for non-convex problems, e.g., [Boyd et al’11]
Simple distributed algorithm for principal component pursuit
Centralized performance guarantees carry over
13
Synthetic data
Random network, N={15,20,25}, T=600
Data: […]
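A test setup of this kind can be sketched as follows. Only N = 15 and T = 600 come from the slide. The random geometric graph, its radius, the rank, the outlier fraction and magnitude, and the fraction of observed entries are all hypothetical stand-ins for the data parameters that did not survive in the transcript.

```python
import numpy as np
from collections import deque

def is_connected(A):
    """Breadth-first search from node 0 over a boolean adjacency matrix."""
    seen, queue = {0}, deque([0])
    while queue:
        i = queue.popleft()
        for j in np.flatnonzero(A[i]):
            j = int(j)
            if j not in seen:
                seen.add(j)
                queue.append(j)
    return len(seen) == A.shape[0]

def random_connected_graph(n_nodes, radius, rng, max_tries=100):
    """Redraw a random geometric graph in the unit square until connected."""
    for _ in range(max_tries):
        pos = rng.random((n_nodes, 2))
        dist = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=2)
        A = (dist <= radius) & ~np.eye(n_nodes, dtype=bool)
        if is_connected(A):
            return A
    raise RuntimeError("no connected graph found; increase radius")

rng = np.random.default_rng(0)
N, T = 15, 600                                        # sizes from the slide
rank, outlier_frac, observed_frac = 3, 0.05, 0.7      # hypothetical parameters

A = random_connected_graph(N, radius=0.5, rng=rng)    # undirected, connected

X = rng.standard_normal((N, rank)) @ rng.standard_normal((rank, T))
O = 10.0 * (rng.random((N, T)) < outlier_frac) * rng.choice([-1.0, 1.0], size=(N, T))
M = rng.random((N, T)) < observed_frac                # True where observed
Y = np.where(M, X + O, np.nan)                        # NaN marks missing data

# Each agent n holds its own row of Y and the indices of its neighbors.
local_rows = {n: (Y[n], np.flatnonzero(A[n])) for n in range(N)}
print("edges:", int(A.sum()) // 2, "observed fraction:", M.mean())
```

Splitting the data by rows mirrors the setting stated earlier in the deck, where each agent has access to only a few rows and exchanges messages with single-hop neighbors.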
14
NorthWrite data
Power consumption of schools, government building, grocery store (’05-’10)
Data: courtesy of NorthWrite Energy Group, provided by Prof. V. Cherkassky (UofM)
Cleansing, Imputation
Outliers: “Building operational transition shoulder periods”
Prediction error: 6% for 30% missing data (8% for 50%)
15
Concluding summary
Load curve data cleansing and imputation: estimate cleansed nominal load profiles; identify when and where ‘bad data’ occur
Leveraging sparsity and low rank: principal component pursuit for smart grid monitoring; distributed algorithm with guaranteed performance
Ongoing research: convergence of ADMM for bi-convex costs; real-time (adaptive) algorithms
Thank You!