A NOVEL DNN-HMM-BASED APPROACH FOR EXTRACTING SINGLE LOADS FROM AGGREGATE POWER SIGNALS

Lukas Mauch and Bin Yang

Institute of Signal Processing and System Theory, University of Stuttgart, Germany

1. Introduction

1.1 Non-intrusive load monitoring

• Estimate the power consumed by individual loads from the aggregate power signal
• The aggregate signal $x(n) = \sum_{k=1}^{M} x_k(n)$, $x_k(n) \in \mathbb{R}^+$, consists of real power measurements of up to hundreds of loads

[Figure: aggregate power signal, power [W] over time t [h]]

• Single loads are non-stationary, with strong temporal relations ranging from seconds to many hours
• Established techniques for single-channel source separation fail

1.2 Contribution

• Combination of a Deep Neural Network (DNN) and a Hidden Markov Model (HMM) for signal extraction from single-channel measurements

2. DNN-HMM Network

2.1 Idea

[Figure 1: The DNN-HMM network for signal extraction. Hidden states $s_k(n-1)$, $s_k(n)$, $s_k(n+1)$ form a Markov chain; each state emits the load signal $x_k(n)$ via $\mathcal{N}(\mu^k_{s_k(n)}, C^k_{s_k(n)})$, and the DNN connects the aggregate observations $x(n)$ to the states.]

• Use the DNN-HMM as a generative model for
  • the observation sequence of the aggregate signal $\{x\} = \{x(1), \dots, x(N)\}$
  • the observation sequence of a single load $\{x_k\} = \{x_k(1), \dots, x_k(N)\}$
  • the hidden state sequence $\{s_k\} = \{s_k(1), \dots, s_k(N)\}$
• Train one DNN-HMM for each target load contained in the aggregate signal
• For each load, the HMM captures the temporal relations in the signal; the DNN is used to infer the state sequence from the aggregate signal

2.2 Architecture of HMM

• Parametrization of the HMM of load k:

$$s_k(n) \in \{1, \dots, M_k\}$$
$$\pi_k = [P(s_k(1) = 1), \dots, P(s_k(1) = M_k)]^T$$
$$p_i^k = p(x_k(n) \mid s_k(n) = i) \sim \mathcal{N}(\mu_i^k, C_i^k)$$
$$\tilde{p}_i^k = p(x(n) \mid s_k(n) = i)$$
$$A_k = [P(s_k(n) = i \mid s_k(n-1) = j)]_{ij}$$

• The joint probability distribution of the sequences is

$$p(\{s_k\}, \{x_k\}, \{x\}) = \pi^k_{s_k(1)} \, p^k_{s_k(1)} \, \tilde{p}^k_{s_k(1)} \cdot \prod_{n=2}^{N} p^k_{s_k(n)} \, \tilde{p}^k_{s_k(n)} \, A_{s_k(n),\,s_k(n-1)}$$
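The factorization above can be evaluated directly in the log domain. The sketch below is illustrative, not the authors' implementation; the function name and the 0-indexed state convention are my own, and the emission terms are assumed to be supplied as precomputed log-likelihood tables.

```python
import numpy as np

def joint_log_prob(s, log_pi, log_A, log_p, log_p_tilde):
    """Log of p({s_k}, {x_k}, {x}) for one load, following the
    factorization pi * p * p~ * prod_n (p * p~ * A).

    s           : (N,) int state sequence s_k(1..N), 0-indexed
    log_pi      : (M_k,) log initial state probabilities
    log_A       : (M_k, M_k), log_A[i, j] = log P(s(n)=i | s(n-1)=j)
    log_p       : (N, M_k) log p(x_k(n) | s_k(n)=i), Gaussian emissions
    log_p_tilde : (N, M_k) log p(x(n) | s_k(n)=i), aggregate emissions
    """
    # n = 1 term: initial probability times both emission terms
    lp = log_pi[s[0]] + log_p[0, s[0]] + log_p_tilde[0, s[0]]
    # n = 2..N terms: emissions times the transition from s(n-1) to s(n)
    for n in range(1, len(s)):
        lp += log_p[n, s[n]] + log_p_tilde[n, s[n]] + log_A[s[n], s[n - 1]]
    return lp
```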

2.3 Architecture of DNN

[Figure 2: Architecture of the D-layer DNN. Layers $x^{(0)}(n) \to x^{(1)}(n) \to \dots \to x^{(D)}(n)$ with parameters $(W^{(1)}, b^{(1)}), \dots, (W^{(D)}, b^{(D)})$.]

• The DNN can model very complex conditional density functions
  • Input: the current observation plus context in a block of length B

    $$x^{(0)}(n) = [x^T(n-B), \dots, x^T(n+B)]^T$$

  • Mapping:

    $$a^{(d)}(n) = W^{(d)} x^{(d-1)}(n) + b^{(d)}, \quad 1 \le d \le D$$
    $$x^{(d)}(n) = \Phi^{(d)}(a^{(d)}(n))$$

  • Output:

    $$x^{(D)}(n) = [f_1(x^{(0)}(n)), \dots, f_{M_k}(x^{(0)}(n))]^T$$
    $$f_i(x^{(0)}(n)) = P(s_k(n) = i \mid x^{(0)}(n))$$

• The DNN is trained to approximate the scaled likelihood

$$\tilde{p}_{s_k(n)} \sim \frac{P(s_k(n) \mid x^{(0)}(n))}{P(s_k(n))}$$
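A forward pass through such a context-window DNN can be sketched as follows. The poster leaves the activations $\Phi^{(d)}$ unspecified, so tanh hidden layers and a softmax output are assumptions here, as is the helper name `dnn_posteriors`.

```python
import numpy as np

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

def dnn_posteriors(x, n, B, weights, biases):
    """Forward pass of a D-layer DNN on a context block of the
    aggregate signal, as described in Sec. 2.3.

    x       : (N,) aggregate signal
    n, B    : center index and one-sided context length (B <= n < N-B)
    weights : list of D weight matrices W^(1)..W^(D)
    biases  : list of D bias vectors b^(1)..b^(D)
    Returns the vector [P(s_k(n)=i | x^(0)(n))]_i of state posteriors.
    """
    # x^(0)(n): stacked context block [x(n-B), ..., x(n+B)]
    h = x[n - B : n + B + 1].astype(float)
    for d, (W, b) in enumerate(zip(weights, biases)):
        a = W @ h + b                       # affine map a^(d)(n)
        # assumed activations: tanh for hidden layers, softmax on top
        h = softmax(a) if d == len(weights) - 1 else np.tanh(a)
    return h
```

The softmax output layer is what makes the last line a valid posterior over the $M_k$ states.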

2.4 Inference and signal extraction

• Infer the state sequence from the aggregate signal, using the state posteriors estimated by the DNN:

$$P(\{s_k\} \mid \{x\}, \lambda_k) = \frac{1}{Z} \, \pi^k_{s_k(1)} \, \tilde{p}^k_{s_k(1)} \prod_{n=2}^{N} \tilde{p}^k_{s_k(n)} \, A_{s_k(n),\,s_k(n-1)}$$

• Reconstruct the power consumed by load k from the state sequence:

$$\hat{x}_k(n) = \mu^k_{\hat{s}_k(n)}$$
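One natural reading of this inference step is Viterbi decoding over the DNN-derived scaled likelihoods, followed by a table lookup of the state means. The sketch below assumes that reading; the function name, 0-indexing, and the explicit posterior/prior division are mine, not the poster's.

```python
import numpy as np

def extract_load(post, priors, log_pi, log_A, mu):
    """Viterbi-decode the most likely state sequence from DNN state
    posteriors and reconstruct x_k(n) = mu^k_{s_k(n)} (Sec. 2.4).

    post   : (N, M_k) DNN posteriors P(s_k(n)=i | x^(0)(n))
    priors : (M_k,) state priors P(s_k(n)=i); dividing posteriors by
             priors gives the scaled likelihoods p~ up to a constant
    log_pi : (M_k,) log initial state probabilities
    log_A  : (M_k, M_k), log_A[i, j] = log P(s(n)=i | s(n-1)=j)
    mu     : (M_k,) state mean powers of load k
    """
    log_pt = np.log(post) - np.log(priors)      # log p~ (up to a const.)
    N, M = post.shape
    delta = log_pi + log_pt[0]                  # best score ending in each state
    psi = np.zeros((N, M), dtype=int)           # backpointers
    for n in range(1, N):
        scores = delta[None, :] + log_A         # scores[i, j]: come from j, go to i
        psi[n] = scores.argmax(axis=1)
        delta = scores.max(axis=1) + log_pt[n]
    s = np.zeros(N, dtype=int)                  # backtrack the best path
    s[-1] = delta.argmax()
    for n in range(N - 1, 0, -1):
        s[n - 1] = psi[n, s[n]]
    return s, mu[s]                             # state path and reconstruction
```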

2.5 Supervised training

• For a given training sequence $\{x_k^t\}$ of load k, maximize

$$p(\{x_k^t\} \mid \lambda_k) = \sum_{\{s_k\}} \pi^k_{s_k(1)} \, p^k_{s_k(1)} \prod_{n=2}^{N} p^k_{s_k(n)} \, A_{s_k(n),\,s_k(n-1)}$$

over the parameters $\lambda_k$ of the DNN-HMM

• Infer the most likely state sequence $\{\hat{s}_k^t\}$ for the given $\{x_k^t\}$ and all possible $\{x^t\}$

• Optimize over the DNN parameters in order to minimize

$$J(\{x_k^t\}, \{x^t\}) = -\sum_{n=1}^{N} \log \sum_{i=1}^{M_k} A_{i,\,\hat{s}_k^t(n-1)} \, \tilde{p}_i^k \, p_i^k \, A_{\hat{s}_k^t(n+1),\,i} = -\log \prod_{n=1}^{N} p(x_k^t(n), x^t(n) \mid \hat{s}_k^t(n-1), \hat{s}_k^t(n+1))$$
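For a fixed decoded sequence, the cost J is a per-frame sum over states, bracketed by the transitions into and out of the frame. A minimal sketch, evaluating interior frames only (the poster does not spell out boundary handling, so restricting to $n = 2, \dots, N-1$ is an assumption, as is the function name):

```python
import numpy as np

def training_cost(s_hat, A, p, p_tilde):
    """Frame-wise cost J of Sec. 2.5 for a fixed decoded sequence
    s_hat (0-indexed), summed over interior frames.

    s_hat   : (N,) decoded states s_hat(n)
    A       : (M_k, M_k), A[i, j] = P(s(n)=i | s(n-1)=j)
    p       : (N, M_k) emission likelihoods p^k_i
    p_tilde : (N, M_k) DNN-derived scaled likelihoods p~^k_i
    """
    N = len(s_hat)
    J = 0.0
    for n in range(1, N - 1):
        # sum_i A[i, s(n-1)] * p~_i(n) * p_i(n) * A[s(n+1), i]
        frame = np.sum(A[:, s_hat[n - 1]] * p_tilde[n] * p[n]
                       * A[s_hat[n + 1], :])
        J -= np.log(frame)
    return J
```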

3. Experiments and results

• Extract the major loads fridge (FR), dishwasher (DW), microwave (MW), and kitchen outlets (KO)
• The Reference Energy Disaggregation Dataset (REDD), containing 18 loads, is used

$$E_k = \frac{1}{F_s} \sum_{n=1}^{N} x_k(n), \qquad \hat{E}_k = \frac{1}{F_s} \sum_{n=1}^{N} \hat{x}_k(n)$$

$$\mathrm{NAD} = \frac{\sum_{n=1}^{N} |\hat{x}_k(n) - x_k(n)|}{\sum_{n=1}^{N} |x_k(n)|}$$
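These metrics translate directly into array operations; the helper name and the unit convention for $F_s$ are mine, as the poster does not state them.

```python
import numpy as np

def power_metrics(x_k, x_k_hat, Fs):
    """Energy estimates E_k, E_k_hat and the normalized absolute
    deviation (NAD) defined above.

    x_k     : (N,) ground-truth power signal of load k
    x_k_hat : (N,) extracted power signal of load k
    Fs      : sampling rate; determines the energy unit
    """
    E = np.sum(x_k) / Fs                   # E_k = (1/Fs) sum_n x_k(n)
    E_hat = np.sum(x_k_hat) / Fs           # same for the extracted signal
    nad = np.sum(np.abs(x_k_hat - x_k)) / np.sum(np.abs(x_k))
    return E, E_hat, nad
```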

[Figure 3: Ground truth (filled blue) and extracted (red) signals. Panels DW, FR, and MW; power [W] over time t [h].]

Appl.   E_k    Ê_k    NAD    Gain
FR      4.30   4.26   0.14   10.5
DW      0.93   0.94   0.08   21.1
MW      1.66   1.49   0.27   13.1
KO      0.35   0.34   0.22   21.0

Table 1: Power-based metrics

ICASSP 2016, March 20-25, Shanghai, China