Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series...

45
A. Dachraoui 1,2 , A. Bondu 1 and A. Cornuéjols 2 1 EDF R&D, 2 AgroParisTechINRA MIA 518 (Paris – France) Early ClassificaJon of Time Series as a Non Myopic SequenJal Decision Making Problem

Transcript of Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series...

Page 1: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

A.  Dachraoui1,2,  A.  Bondu1  and  A.  Cornuéjols2  

1  EDF  R&D,  2AgroParisTech-­‐INRA      MIA  518  

(Paris  –  France)  

Early  ClassificaJon  of  Time  Series    

as  a  Non  Myopic  SequenJal  Decision  Making  Problem    

Page 2: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

2 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

Outline  

1.  IntroducJon:  a  new  set  of  problems  

2.  Formal  statement  and  a  soluJon  

3.  The  proposed  approach  

4.  Experimental  results  

5.  Conclusions  

Page 3: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

3 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

Introduc)on  

Page 4: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

4 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

(Early)  classificaJon  of  Jme  series  

  Monitoring  of  consumer  ac+ons  on  a  web  site:    will  buy      or      not  

  Monitoring  of  a  pa+ent  state:        criJcal          or      not  

  Early  predicJon  of  daily  electrical  consump+on:    high                  or      low  

x(t)

T

Training  set  

Page 5: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

5 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

Standard  classifica)on  of  Jme  series  

  What  is  the  class  of  the  new  Jme  series  xT?  

x(t)

!

T

Page 6: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

6 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

Early  classifica)on  of  Jme  series  

  What  is  the  class  of  the  new  incomplete  Jme  series  xt?  

x(t)

T

!

t

Page 7: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

7 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

New  set  of  decision  problems  :  early  classificaJon  

  Data  stream  

  Classifica)on  task  

  As  early  as  possible  

  A  trade-­‐off  

–  ClassificaJon  performance    (be[er  if  t              )  

–  Cost  of  delaying  predicJon    (be[er  if  t              )  

Page 8: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

8 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

Previous  works  

  Sequen'al  decision  making        (1933,  1948)  

–  Wald’s  sequen+al  probability  ra+o  test  

Page 9: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

9 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

Previous  works  

  Sequen'al  decision  making        (1933,  1948)  

–  Wald’s  sequen+al  probability  ra+o  test  

•  Difficult  to  esJmate  

•  Have  to  set  the  thresholds  α  and  β      

•  No  explicit  cost  of  delaying  decision  •  Myopic  decision  criterion  

Page 10: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

10 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

Previous  works  

  Sequen'al  decision  making        (1933,  1948)  

–  Wald’s  sequen+al  probability  ra+o  test  

•  Difficult  to  esJmate  

•  Have  to  set  the  thresholds  α  and  β      

•  No  explicit  cost  of  delaying  decision  •  Myopic  decision  criterion  

  Other  (modern)  approaches  

•  SophisJcated  heurisJcs    

•  No  explicit  cost  of  delaying  decision  •  Myopic  decision  criterion  

[Ishiguro  et  al.  2000]  [Sochman  &  Matas,  2005]  [Xing  et  al.,  2009,  2011]    [Anderson  et  al.,  2012]  

[Parrish  et  al.,  2013]  [Hatami  &  Chira,  2013]  [Ghalwash  et  al.,  2014]  …  

Page 11: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

11 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

Our  contribuJon  

1.    Formalize  the  problem  as  a  sequenJal  decision  making  problem  

2.    Explicit  trade-­‐off  –  ClassificaJon  performance  

–  Cost  of  delaying  the  decision  

3.    Adap)ve    –  Takes  into  account  the  peculiariJes  of  xt    

4.    Non  myopic  

–  At  each  Jme  step,  esJmates  the  expected  future  Jme  for  opJmal  decision  

An  algorithm  

that  realizes  all  

these  items  

Page 12: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

12 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

Formal  analysis  and  first  a[empt  

Page 13: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

13 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

Decision  making  (1)    

  Given  an  incoming  sequence  

  And  given:  

–  A  miss-­‐classifica'on  cost  func'on  

–  A  delaying  decision  cost  func'on    

  What  is  the  op)mal  )me  to  make  a  decision?    

Page 14: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

14 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

Decision  making  (1)    

  Given  an  incoming  sequence  

  And  given:  

–  A  miss-­‐classifica'on  cost  func'on  

–  A  delaying  decision  cost  func'on    

  What  is  the  op)mal  )me  to  make  a  decision?    

Expected  cost  for  a  decision  at  Jme  t    

Page 15: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

15 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

Decision  making  (1)    

  Given  an  incoming  sequence  

  And  given:  

–  A  miss-­‐classifica'on  cost  func'on  

–  A  delaying  decision  cost  func'on    

  What  is  the  op)mal  )me  to  make  a  decision?    

Expected  cost  for  a  decision  at  Jme  t    

OpJmal  Jme:      

Page 16: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

16 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

Decision  making  (1)    

  Given  an  incoming  sequence  

  And  given:  

–  A  miss-­‐classifica'on  cost  func'on  

–  A  delaying  decision  cost  func'on    

  What  is  the  op)mal  )me  to  make  a  decision?    

Expected  cost  for  a  decision  at  Jme  t    

OpJmal  Jme:      

?

Page 17: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

17 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

Decision  making  (2)    

  Expected  cost  for  any  sequence  of  length  t    

Page 18: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

18 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

Decision  making  (2)    

  Expected  cost  for  any  sequence  of  length  t    

Computable  from  a  training  set    

Page 19: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

19 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

Decision  making  (2)    

  Expected  cost  for  any  sequence  of  length  t    

Computable  from  a  training  set    

  Not  adapJve!!    

Page 20: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

20 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

Challenge  

  How  can  we  get  an  opJmizaJon  criterion  st.  

–  It  takes  into  account  the  incoming  sequence  xt  (adap)ve)  

–  It  can  be  easily  es)mated  (from  the  training  set)  

Page 21: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

21 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

The  proposed  approach  

Page 22: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

22 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

The  principle  

Page 23: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

23 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

The  principle  

1.    During  training:    

–  idenJfy  meaningful  subsets  of  Jme  sequences  in  the  training  set:  ck    

Clustering  

Page 24: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

24 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

The  principle  

1.    During  training:    

–  idenJfy  meaningful  subsets  of  Jme  sequences  in  the  training  set:  ck    

–  For  each  of  these  subsets  ck,  and  for  each  Jme  step  t    

•  EsJmate  the  confusion  matrices    Clustering  

–  T  classifiers  are  learnt  

–  And  their  confusion  matrices                                                  are  es)mated  on  a  test  set        

… …c1

ck … …

Page 25: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

25 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

The  principle  

1.    During  training:    

–  idenJfy  meaningful  subsets  of  Jme  sequences  in  the  training  set:  ck    

–  For  each  of  these  subsets  ck,  and  for  each  Jme  step  t    

•  EsJmate  the  confusion  matrices    

2.    Tes)ng:  For  any  new  incomplete  incoming  sequence  xt  –  IdenJfy  the  most  likely  subset:  the  closer  class  of  shapes  to  xt  

Clustering  

Membership  probability  

Page 26: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

26 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

The  principle  

1.    During  training:    

–  idenJfy  meaningful  subsets  of  Jme  sequences  in  the  training  set:  ck    

–  For  each  of  these  subsets  ck,  and  for  each  Jme  step  t    

•  EsJmate  the  confusion  matrices    

2.    Tes)ng:  For  any  new  incomplete  incoming  sequence  xt  –  IdenJfy  the  most  likely  subset:  the  closer  shape  to  xt  

–  Compute  the  expected  cost  of  decision  for  all  future  Jme  steps  

Clustering  

Page 27: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

27 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

A  non  myopic  decision  process  

  OpJmal  esJmated  Jme  relaJve  to  current  Jme  t    

0 T-t

t T

X

ConJnue  monitoring  

Page 28: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

28 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

A  non  myopic  decision  process  

  OpJmal  esJmated  Jme  relaJve  to  current  Jme  t    

t+1 T

X

0 T-ti

T-(t+1) 0

ConJnue  monitoring  

Page 29: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

29 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

A  non  myopic  decision  process  

  OpJmal  esJmated  Jme  relaJve  to  current  Jme  t    

X

0 T-ti

T-(t+1) 0

t+2 T

=0 T-(t+2)

Take  decision  

Page 30: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

30 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

ProperJes  

  The  decision  criterion  naturally  incorporates    

–  The  quality      of  the  decision  

–  The  cost  of  delaying    decision  

  Adap)ve:  the  decision  depends  upon  xt    

  Non  myopic:    

–  At  each  Jme  step  the  expected  best  Jme  for  decision  is  esJmated  

Page 31: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

31 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

A  simple  implementaJon  

Page 32: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

32 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

A  baseline  implementaJon:  simple  and  direct  

  The  distance  used  to  measure  proximity  between  sequences  is  the  

Euclidian  distance  

–  Clustering  of  sequences  

–  Clustering  membership  

Page 33: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

33 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

A  baseline  implementaJon:  simple  and  direct  

  The  distance  used  to  measure  proximity  between  sequences  is  the  

Euclidian  distance  

–  Clustering  of  sequences  

–  Clustering  membership  

  Classifiers  

–  Naïve  Bayes  

–  MulJ-­‐Layer  Perceptrons  

Page 34: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

34 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

A  baseline  implementaJon:  simple  and  direct  

  The  distance  used  to  measure  proximity  between  sequences  is  the  

Euclidian  distance  

–  Clustering  of  sequences  

–  Clustering  membership  

  Classifiers  

–  Naïve  Bayes  

–  MulJ-­‐Layer  Perceptrons  

Other  choices  are  possible  within  the  general  approach  

Page 35: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

35 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

Experiments  

Page 36: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

36 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

Controlled  data  

  Control  of  –  The  proximity  between  the  classes  

–  The  number  and  shapes  of  clusters  within  each  class  

–  The  noise  level  

Page 37: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

37 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

Results  

Page 38: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

38 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

Results:  effect  of  the  noise  level  

Increasing  the  noise  

level  increases  the  

waiJng  Jme,  and  then  

it’s  no  longer  worth  it  

12 Asma Dachraoui et al.

±b 0.02 0.05 0.07C(t)

ε(t) τ� σ(τ�) AUC τ� σ(τ�) AUC τ� σ(τ�) AUC

0.01

0.2 9.0 2.40 0.99 9.0 2.40 0.99 10.0 0.0 1.000.5 13.0 4.40 0.98 13.0 4.40 0.98 15.0 0.18 1.001.5 24.0 10.02 0.98 32.0 2.56 1.00 30.0 12.79 0.995.0 26.0 7.78 0.84 30.0 18.91 0.87 30.0 19.14 0.8810.0 38.0 18.89 0.70 48.0 1.79 0.74 46.0 5.27 0.7515.0 23.0 15.88 0.61 32.0 13.88 0.64 29.0 17.80 0.6220.0 7.0 8.99 0.52 11.0 11.38 0.55 4.0 1.22 0.52

0.05

0.2 8.0 2.00 0.98 8.0 2.00 0.98 9.0 0.0 1.000.5 10.0 2.80 0.96 8.0 4.0 0.98 14.0 0.41 0.991.5 5.0 0.40 0.68 20.0 0.42 0.95 14.0 4.80 0.885.0 8.0 3.87 0.68 6.0 1.36 0.64 5.0 0.50 0.6510.0 4.0 0.29 0.56 4.0 0.25 0.56 4.0 0.34 0.5715.0 4.0 0.0 0.54 4.0 0.25 0.56 4.0 0.0 0.5520.0 4.0 0.0 0.52 4.0 0.0 0.52 4.0 0.0 0.52

0.10

0.2 6.0 0.80 0.95 7.0 1.60 0.94 8.0 0.40 0.960.5 6.0 0.80 0.84 9.0 2.40 0.93 10.0 0.0 0.951.5 4.0 0.0 0.67 5.0 0.43 0.68 6.0 0.80 0.745.0 4.0 0.07 0.64 4.0 0.05 0.64 4.0 0.11 0.6410.0 4.0 0.0 0.56 48.0 1.79 0.74 4.0 0.22 0.5615.0 4.0 0.0 0.55 4.0 0.0 0.55 4.0 0.0 0.5520.0 4.0 0.0 0.52 11.0 11.38 0.55 4.0 0.0 0.52

Table 1. Experimental results in function of the waiting cost C(t) = {0.01, 0.05, 0.1}×t, the noise level ε(t) and the trend parameter b.

Page 39: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

39 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

Results:  effect  of  the  waiJng  cost  

Increasing  the  

wai)ng  cost    

reduces  the  waiJng  

Jme  

Early Classification of Time Series 13

Effect of the waiting cost

±b 0.02 0.05 0.07C(t)

ε(t) τ� σ(τ�) AUC τ� σ(τ�) AUC τ� σ(τ�) AUC

0.01

0.2 9.0 2.40 0.99 9.0 2.40 0.99 10.0 0.0 1.000.5 13.0 4.40 0.98 13.0 4.40 0.98 15.0 0.18 1.001.5 24.0 10.02 0.98 32.0 2.56 1.00 30.0 12.79 0.995.0 26.0 7.78 0.84 30.0 18.91 0.87 30.0 19.14 0.8810.0 38.0 18.89 0.70 48.0 1.79 0.74 46.0 5.27 0.7515.0 23.0 15.88 0.61 32.0 13.88 0.64 29.0 17.80 0.6220.0 7.0 8.99 0.52 11.0 11.38 0.55 4.0 1.22 0.52

0.05

0.2 8.0 2.00 0.98 8.0 2.00 0.98 9.0 0.0 1.000.5 10.0 2.80 0.96 8.0 4.0 0.98 14.0 0.41 0.991.5 5.0 0.40 0.68 20.0 0.42 0.95 14.0 4.80 0.885.0 8.0 3.87 0.68 6.0 1.36 0.64 5.0 0.50 0.6510.0 4.0 0.29 0.56 4.0 0.25 0.56 4.0 0.34 0.5715.0 4.0 0.0 0.54 4.0 0.25 0.56 4.0 0.0 0.5520.0 4.0 0.0 0.52 4.0 0.0 0.52 4.0 0.0 0.52

0.10

0.2 6.0 0.80 0.95 7.0 1.60 0.94 8.0 0.40 0.960.5 6.0 0.80 0.84 9.0 2.40 0.93 10.0 0.0 0.951.5 4.0 0.0 0.67 5.0 0.43 0.68 6.0 0.80 0.745.0 4.0 0.07 0.64 4.0 0.05 0.64 4.0 0.11 0.6410.0 4.0 0.0 0.56 48.0 1.79 0.74 4.0 0.22 0.5615.0 4.0 0.0 0.55 4.0 0.0 0.55 4.0 0.0 0.5520.0 4.0 0.0 0.52 11.0 11.38 0.55 4.0 0.0 0.52

Table 2. Experimental results in function of the waiting cost C(t) = {0.01, 0.05, 0.1}×t, the noise level ε(t) and the trend parameter b.

Page 40: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

40 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

Results:  effect  of  the  difference  between  classes  

Increase  of  the  

difference  between  

classes  

The  performance  

increases  (AUC)  

The  wai'ng  'me  is  not  

much  changed  in  these  

experiments  

14 Asma Dachraoui et al.

Effect of the difference between the classes

±b 0.02 0.05 0.07C(t)

ε(t) τ� σ(τ�) AUC τ� σ(τ�) AUC τ� σ(τ�) AUC

0.01

0.2 9.0 2.40 0.99 9.0 2.40 0.99 10.0 0.0 1.000.5 13.0 4.40 0.98 13.0 4.40 0.98 15.0 0.18 1.001.5 24.0 10.02 0.98 32.0 2.56 1.00 30.0 12.79 0.995.0 26.0 7.78 0.84 30.0 18.91 0.87 30.0 19.14 0.8810.0 38.0 18.89 0.70 48.0 1.79 0.74 46.0 5.27 0.7515.0 23.0 15.88 0.61 32.0 13.88 0.64 29.0 17.80 0.6220.0 7.0 8.99 0.52 11.0 11.38 0.55 4.0 1.22 0.52

0.05

0.2 8.0 2.00 0.98 8.0 2.00 0.98 9.0 0.0 1.000.5 10.0 2.80 0.96 8.0 4.0 0.98 14.0 0.41 0.991.5 5.0 0.40 0.68 20.0 0.42 0.95 14.0 4.80 0.885.0 8.0 3.87 0.68 6.0 1.36 0.64 5.0 0.50 0.6510.0 4.0 0.29 0.56 4.0 0.25 0.56 4.0 0.34 0.5715.0 4.0 0.0 0.54 4.0 0.25 0.56 4.0 0.0 0.5520.0 4.0 0.0 0.52 4.0 0.0 0.52 4.0 0.0 0.52

0.10

0.2 6.0 0.80 0.95 7.0 1.60 0.94 8.0 0.40 0.960.5 6.0 0.80 0.84 9.0 2.40 0.93 10.0 0.0 0.951.5 4.0 0.0 0.67 5.0 0.43 0.68 6.0 0.80 0.745.0 4.0 0.07 0.64 4.0 0.05 0.64 4.0 0.11 0.6410.0 4.0 0.0 0.56 48.0 1.79 0.74 4.0 0.22 0.5615.0 4.0 0.0 0.55 4.0 0.0 0.55 4.0 0.0 0.5520.0 4.0 0.0 0.52 11.0 11.38 0.55 4.0 0.0 0.52

Table 3. Experimental results in function of the waiting cost C(t) = {0.01, 0.05, 0.1}×t, the noise level ε(t) and the trend parameter b.

Page 41: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

41 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

Results:  the  method  is  adapJve  

  The  expected  opJmal  decision  Jme  depends  on  the  incoming  sequence  

Page 42: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

42 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

Real  dataset  

  TwoLeadECG  (UCI  repository)  

  1,162  signals  of  81  measurements  each  (81  minutes)  

  Two  classes    

  We  arbitrarily  varied  the  waiJng  cost:    

•  C(t)  =  0.01  .    t        (cheap)  

•  C(t)  =  0.05  .    t        (costly)  

•  C(t)  =  0.10  .    t        (very  costly)  

Adapts  to  keep  a  

good  performance  

with  fewer  

measurements  

Early Classification of Time Series 17

±b 0.02 0.05 0.07(K−1,K+1)

ε(t) τ� σ(τ�) AUC τ� σ(τ�) AUC τ� σ(τ�) AUC

(3,2)

0.2 9.0 2.40 0.99 9.0 2.40 0.99 10.0 0.0 1.000.5 13.0 4.40 0.98 13.0 4.40 0.98 15.0 0.18 1.001.5 24.0 10.02 0.98 32.0 2.56 1.00 30.0 12.79 1.005.0 26.0 7.78 0.84 30.0 18.90 0.87 30.0 19.14 0.8810.0 38.0 18.89 0.70 48.0 1.79 0.74 46.0 5.27 0.7515.0 23.0 15.88 0.61 32.0 13.88 0.64 29.0 17.80 0.6220.0 7.0 8.99 0.52 11.0 11.38 0.55 4.0 1.22 0.52

(3,5)

0.2 7.0 2.47 0.86 7.0 2.15 0.89 7.0 3.00 0.850.5 11.0 5.10 0.87 10.0 4.87 0.88 14.0 7.07 0.911.5 20.0 12.69 0.85 18.0 11.80 0.87 26.0 16.33 0.895.0 44.0 4.75 0.83 46.0 2.81 0.87 38.0 11.49 0.8110.0 42.0 6.34 0.67 39.0 7.59 0.68 25.0 8.57 0.6115.0 28.0 5.99 0.58 32.0 6.51 0.59 19.0 10.12 0.5820.0 17.0 11.72 0.50 13.0 10.72 0.56 17.0 5.93 0.55

Table 5. Experimental results in function of the noise level ε(t), the trend parameterb, and the number of subgroups k+1 and k−1 in each class. The waiting cost C(t) isfixed to 0.01.

where d ∈ {0.01, 0.05, 0.1}. The question here is whether the method is able tomake reliable prediction early and provide reasonable results.

Table (6) reports the average of optimal times of decision τ� of test timeseries, its associated standard deviation σ(τ�), and the performance of the pre-diction AUC. It is remarkable that a very good performance, as measured by theAUC, can be obtained from a limited set of measurements: E.g. 22 out of 81 ifC(t) = 0.01, 24 out of 81 if C(t) = 0.05, and 10 out of 81 if C(t) = 0.1.

C(t) 0.01 0.05 0.1τ� 22.0 24.0 10.0

σ(τ�) 6.1 15.7 9.8AUC 0.99 0.99 0.91

Table 6. Experimental results on real data in function of the waiting cost C(t).

We therefore see that the baseline solution proposed here is able to (1) adaptto each incoming sequence and (2) to predict an estimated optimal time ofprediction that yields very good prediction performances while controlling thecost of delay.

Page 43: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

43 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

Conclusions  

Page 44: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

44 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

Conclusions  

  Online  classifica)on  of  data  streams  is  increasingly  important  

  Contribu)on  

–  A  new  op)miza)on  criterion  incorporaJng    

•  ClassificaJon  performance  

•  Cost  of  delaying  decision  

–  A  baseline  method  

•   AdapJve    •   Non  myopic  

•   A  spectrum  of  different  implementaJons  is  possible  

–  Experimental  results  show  the  promise  of  the  method  

Page 45: Early#Classificaon#of# Time#Series## - AgroParisTech · Early Classification of Time Series (A3.BCD, ECML-PKDD-15) 4 / 45 (Early)#classificaon#of#Jme#series# Monitoring#of#consumer)ac+onson)a)web)site:

45 / 45 Early Classification of Time Series (A3. BCD, ECML-PKDD-15)

PerspecJves  

  Explora)on  of  the  spectrum  of  variaJons  

1.   BeTer  clustering  method  of  the  training  sequences  

•  More  informed  distance  

2.    A  more  direct  approach  without  clustering  on  sequences  

3.    BeTer  classifier  of  incomplete  sequences  

  Applica)on  to  electrical  grid  data