Spatio-temporal Load Curve Data Cleansing and Imputation via Sparsity and Low Rank

Gonzalo Mateos and Georgios B. Giannakis

Dept. of ECE and Digital Technology Center, University of Minnesota

November 5, 2012

Workshop on Architectures and Models for the Smart Grid

Context

Robust imputation of network data

Goal: Given a few rows of the data matrix per agent, perform distributed cleansing and imputation by leveraging the low rank of the nominal data matrix and the sparsity of the outliers.

Network health cartography

Smart metering

Wind farm monitoring

Load curve data cleansing

Load curve: electric power consumption recorded periodically

Reliable data: key to realizing the smart grid vision [Hauser’09]

Uruguay’s aggregate power consumption (MW)

Missing data: Faulty meters, communication errors, few PMUs

Outliers: Unscheduled maintenance, strikes, sport events [Chen et al’10]

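Spatio-temporal load profiles

Power measured at bus n, at time t, collected in the data matrix Y

Spatio-temporal model: Y is modeled as a low-rank matrix X plus a sparse matrix O
  Low-rank nominal load profiles (X)
  Sparse outliers across buses and time (O)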
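To make the model concrete, here is a minimal simulation sketch. It assumes the data matrix is the sum of a low-rank matrix of nominal profiles, a sparse outlier matrix, and small dense noise. The function name `make_load_data`, the symbols `X`, `O`, `E`, and every parameter value are illustrative assumptions, not settings taken from the talk.

```python
import numpy as np

def make_load_data(n_buses=20, n_times=600, rank=3, outlier_frac=0.05,
                   outlier_size=5.0, noise_std=0.01, missing_frac=0.3, seed=0):
    """Simulate Y = X + O + E together with a mask of observed entries."""
    rng = np.random.default_rng(seed)
    # Low-rank nominal profiles: product of two thin Gaussian factors.
    X = rng.standard_normal((n_buses, rank)) @ rng.standard_normal((rank, n_times))
    # Sparse outliers: a small fraction of entries receives a +/- spike.
    is_outlier = rng.random((n_buses, n_times)) < outlier_frac
    O = np.zeros((n_buses, n_times))
    O[is_outlier] = rng.choice([-outlier_size, outlier_size],
                               size=int(is_outlier.sum()))
    # Small dense noise (an assumption; use noise_std=0 for a noiseless model).
    E = noise_std * rng.standard_normal((n_buses, n_times))
    # observed[n, t] is False where the reading at bus n, time t is missing.
    observed = rng.random((n_buses, n_times)) >= missing_frac
    return X + O + E, X, O, observed

Y, X, O, observed = make_load_data()
print("shape:", Y.shape, "rank of X:", np.linalg.matrix_rank(X))
print("outlier fraction:", np.mean(O != 0), "observed fraction:", observed.mean())
```

Rows index buses and columns index time, so a spike in `O` marks both where and when a reading is corrupted.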

Principal Component Pursuit

Principal component pursuit [Chandrasekaran et al’11], [Candes et al’11]

Assumptions: X has low rank, the outlier matrix O is sparse

Goal: Given Y, recover X and O

Data model

Missing data: set Ω of observed entries

Sampling operator P_Ω: keeps the entries in Ω and sets all others to zero
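A small sketch of how such a sampling operator can be written, assuming the observed set is stored as a boolean mask with the same shape as the data. The names `sample`, `masked_fit`, and `observed` are hypothetical.

```python
import numpy as np

def sample(M, observed):
    """Sampling operator: keep entries flagged in `observed`, zero out the rest."""
    return np.where(observed, M, 0.0)

def masked_fit(Y, X_hat, O_hat, observed):
    """Least-squares misfit evaluated over the observed entries only."""
    return 0.5 * np.sum(sample(Y - X_hat - O_hat, observed) ** 2)

Y = np.array([[1.0, 2.0],
              [3.0, 4.0]])
observed = np.array([[True, False],
                     [True, True]])
print(sample(Y, observed))
print(masked_fit(Y, np.zeros((2, 2)), np.zeros((2, 2)), observed))
```

Because the misfit ignores unobserved entries, the low-rank estimate is free to fill them in, which is what makes imputation possible.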

Distributed processing paradigms

Fusion Center (FC), incremental, and in-network processing

Limitations of FC-based architectures:
  Lack of robustness (isolated point of failure, non-ideal links)
  High Tx power (as geographical area grows)
  Less suitable for tracking applications

Limitations of incremental processing:
  Non-robust to node failures
  (Re-)routing? Hamiltonian routes NP-hard to establish

Problem statement

Network of smart meters: undirected, connected graph

(P1)
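The formula for (P1) did not survive in this transcript. A plausible form, assuming a least-squares fit over the observed entries, a nuclear-norm penalty on the low-rank component, and an ℓ1 penalty on the outliers, is sketched below. The symbols O, Ω, P_Ω, λ_* and λ_1 are assumptions introduced for illustration.

```latex
\text{(P1)}\qquad
\min_{X,\,O}\;
\tfrac{1}{2}\,\bigl\|\mathcal{P}_{\Omega}(Y - X - O)\bigr\|_F^2
\;+\; \lambda_*\,\|X\|_*
\;+\; \lambda_1\,\|O\|_1
```

Here the nuclear norm (sum of singular values) promotes low rank in X, and the entrywise ℓ1 norm promotes sparsity in O.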

Challenges:
  Nuclear norm is not separable
  Global optimization variable

Goal: Given local data per node and single-hop exchanges, solve (P1) in a distributed fashion

Separable regularization

Key property of the nuclear norm; e.g., [Recht et al’11]
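The property in question is the standard variational characterization of the nuclear norm: it equals the minimum of (1/2)(‖L‖_F² + ‖Q‖_F²) over all factorizations X = LQ'. The sketch below checks this numerically on a random matrix. The sizes and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 3)) @ rng.standard_normal((3, 12))  # rank-3 matrix

U, s, Vt = np.linalg.svd(X, full_matrices=False)
nuclear_norm = s.sum()  # sum of singular values

# Balanced factorization X = L @ Q.T built from the SVD attains the minimum.
L = U * np.sqrt(s)
Q = Vt.T * np.sqrt(s)
balanced = 0.5 * (np.sum(L ** 2) + np.sum(Q ** 2))

# Any other factorization of the same X can only give a larger value.
L2, Q2 = 2.0 * L, Q / 2.0
unbalanced = 0.5 * (np.sum(L2 ** 2) + np.sum(Q2 ** 2))

print(nuclear_norm, balanced, unbalanced)
```

The Frobenius terms are sums over rows of L and Q, so the penalty splits across agents in a way the nuclear norm itself does not.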

New formulation equivalent to (P1)

(P2)
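The formula for (P2) is also missing from the transcript. Assuming (P1) has the usual form (masked least squares plus nuclear-norm and ℓ1 penalties) and that X is replaced by a product of two thin factors, a plausible reconstruction is the following. The symbols L, Q, O, Ω, λ_* and λ_1 are assumptions.

```latex
\text{(P2)}\qquad
\min_{L,\,Q,\,O}\;
\tfrac{1}{2}\,\bigl\|\mathcal{P}_{\Omega}(Y - LQ^{\top} - O)\bigr\|_F^2
\;+\; \tfrac{\lambda_*}{2}\,\bigl(\|L\|_F^2 + \|Q\|_F^2\bigr)
\;+\; \lambda_1\,\|O\|_1,
\qquad L \in \mathbb{R}^{N\times\rho},\; Q \in \mathbb{R}^{T\times\rho}
```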

Nonconvex; reduces complexity: X is factored into two matrices with ρ ≥ rank[X] columns each

Proposition 1. If a stationary point of (P2) also satisfies a qualification condition, then the estimate it defines is a global optimum of (P1).
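For intuition on how a factorized, separable formulation can be attacked without any SVD, here is a centralized alternating-minimization sketch. It is not the distributed ADMM solver of the talk. It assumes a cost made of a masked least-squares fit, Frobenius penalties on the two factors, and an ℓ1 penalty on the outliers. All names and parameter values are hypothetical.

```python
import numpy as np

def soft_threshold(M, tau):
    """Entrywise minimizer of 0.5*(m - o)**2 + tau*|o|."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def ridge_rows(R, F, observed, lam):
    """Row i solves a small ridge regression of the observed R[i, :] on F."""
    rho = F.shape[1]
    W = np.empty((R.shape[0], rho))
    for i in range(R.shape[0]):
        Fi = F[observed[i]]  # regressors for the observed entries of row i
        W[i] = np.linalg.solve(Fi.T @ Fi + lam * np.eye(rho),
                               Fi.T @ R[i, observed[i]])
    return W

def cost(Y, observed, L, Q, O, lam_star, lam_1):
    fit = np.where(observed, Y - L @ Q.T - O, 0.0)
    return (0.5 * np.sum(fit ** 2)
            + 0.5 * lam_star * (np.sum(L ** 2) + np.sum(Q ** 2))
            + lam_1 * np.abs(O).sum())

def alternating_minimization(Y, observed, rho, lam_star, lam_1,
                             n_iter=30, seed=0):
    rng = np.random.default_rng(seed)
    N, T = Y.shape
    Q = rng.standard_normal((T, rho))
    O = np.zeros((N, T))
    history = []
    for _ in range(n_iter):
        L = ridge_rows(Y - O, Q, observed, lam_star)        # one row per bus
        Q = ridge_rows((Y - O).T, L, observed.T, lam_star)  # one row per time
        # Outliers: soft-threshold the residual; unobserved entries stay zero.
        O = np.where(observed, soft_threshold(Y - L @ Q.T, lam_1), 0.0)
        history.append(cost(Y, observed, L, Q, O, lam_star, lam_1))
    return L, Q, O, history

# Toy data: rank-2 matrix, 80% of entries observed, one gross outlier.
rng = np.random.default_rng(1)
Y_toy = rng.standard_normal((10, 2)) @ rng.standard_normal((2, 40))
obs_toy = rng.random((10, 40)) < 0.8
Y_toy[0, 0] += 10.0
L, Q, O, history = alternating_minimization(Y_toy, obs_toy, rho=3,
                                            lam_star=0.1, lam_1=0.5)
print(history[0], history[-1])
```

Each of the three updates minimizes the cost exactly over its own block, so the recorded cost cannot increase from one sweep to the next. Every step is either a ρ × ρ linear solve or an entrywise threshold.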

Distributed estimator

Network connectivity implies that (P2) and (P3) are equivalent

(P3)
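The formula for (P3) is not legible in the transcript. One plausible form, assuming the separable cost of (P2) is split row-wise across the N agents and each agent n keeps a local copy Q_n of the shared factor, is sketched here. The symbols y_n (the n-th row of Y as a column vector), l_n, o_n, Ω_n and the neighborhood J_n are assumptions.

```latex
\text{(P3)}\qquad
\min_{\{Q_n,\,l_n,\,o_n\}}\;
\sum_{n=1}^{N}\Bigl[
\tfrac{1}{2}\,\bigl\|\mathcal{P}_{\Omega_n}(y_n - Q_n l_n - o_n)\bigr\|_2^2
+ \tfrac{\lambda_*}{2}\,\|l_n\|_2^2
+ \tfrac{\lambda_*}{2N}\,\|Q_n\|_F^2
+ \lambda_1\,\|o_n\|_1
\Bigr]
\quad\text{s.t.}\quad Q_n = Q_m,\;\; m \in \mathcal{J}_n,\;\; n = 1,\dots,N
```

On a connected graph the neighbor-wise constraints force all local copies Q_n to coincide, and the N terms weighted by λ_*/(2N) then add up to a single Frobenius penalty on the shared factor.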

Consensus with neighboring nodes

Alternating-directions method of multipliers (ADMM) solver:
  Method [Glowinski-Marrocco’75], [Gabay-Mercier’76]
  Learning over networks [Schizas et al’07]
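To illustrate consensus ADMM with single-hop exchanges, the sketch below solves a deliberately simple stand-in problem, network-wide averaging of one scalar per agent, rather than (P3) itself. The update form is the standard decentralized consensus ADMM recursion for quadratic local costs. The function name, the step size `c`, and the ring topology are assumptions.

```python
import numpy as np

def consensus_admm_average(a, A, c=1.0, n_iter=200):
    """Agent n holds a[n]; all local estimates converge to mean(a).

    A is the symmetric 0/1 adjacency matrix of a connected graph. Each
    iteration uses only sums over single-hop neighbors (the products A @ x).
    """
    a = np.asarray(a, dtype=float)
    deg = A.sum(axis=1)   # number of neighbors per agent
    x = np.zeros_like(a)  # local primal estimates
    p = np.zeros_like(a)  # local multipliers
    for _ in range(n_iter):
        # Primal step: closed-form minimizer of the local quadratic subproblem.
        x = (a - p + c * (deg * x + A @ x)) / (1.0 + 2.0 * c * deg)
        # Dual step: accumulate disagreement with the neighbors.
        p = p + c * (deg * x - A @ x)
    return x

# Ring of 6 agents; every agent talks to its two neighbors only.
N = 6
A = np.zeros((N, N))
for n in range(N):
    A[n, (n + 1) % N] = A[(n + 1) % N, n] = 1.0

a = np.arange(N, dtype=float)
x = consensus_admm_average(a, A)
print(x, a.mean())
```

No agent ever sees the full data vector, yet all local estimates agree in the limit. The same exchange pattern applies when the shared quantity is a matrix factor instead of a scalar.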

Primal variables per agent n:

Message passing: agent n exchanges messages with its single-hop neighbors

Distributed iterations

Attractive features

Highly parallelizable with simple recursions:
  Unconstrained QPs per agent
  No SVD per iteration [O(Tρ^3) complexity]

Low overhead for message exchanges:
  Each message is a T × ρ matrix, and ρ is small
  Comm. cost independent of network size

Recap: (P1) → (P2) → (P3)
  (P1): centralized, convex
  (P2): separable regularization, nonconvex
  (P3): consensus, nonconvex

Stationary (P3) → stationary (P2) → global (P1)

Optimality

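Proposition 2. If the iterates of the distributed algorithm converge and the limit point satisfies a qualification condition, then:

i) the local estimates agree across agents (consensus);

ii) the resulting estimate is the global optimum of (P1).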

ADMM can converge even for non-convex problems, e.g., [Boyd et al’11]

Simple distributed algorithm for principal component pursuit

Centralized performance guarantees carry over

Synthetic data

Random network, N={15,20,25}, T=600

Data , ,

NorthWrite data

Power consumption of schools, government building, grocery store (’05-’10)

Data: courtesy of NorthWrite Energy Group, provided by Prof. V. Cherkassky (UofM)

Cleansing: outliers coincide with “Building operational transition shoulder periods”

Imputation: prediction error of 6% for 30% missing data (8% for 50%)

Concluding summary

Load curve data cleansing and imputation

Leveraging sparsity and low rank

Principal component pursuit for smart grid monitoring

Estimate cleansed nominal load profiles

Identify when and where ‘bad data’ occur

Distributed algorithm with guaranteed performance

Ongoing research:
  Convergence of ADMM for bi-convex costs
  Real-time (adaptive) algorithms

Thank You!
