Mining Relationships Among Interval-based Events for Classification. Dhaval Patel, Wynne Hsu...

Mining Relationships Among Interval-based Events for Classification. Dhaval Patel, Wynne Hsu, Mong Li Lee. SIGMOD 08

Transcript of Mining Relationships Among Interval-based Events for Classification

  • Slide 1
  • 1 Mining Relationships Among Interval-based Events for Classification. Dhaval Patel, Wynne Hsu, Mong Li Lee. SIGMOD 08
  • Slide 2
  • 2 Outline. Introduction; Preliminaries; Augmented hierarchical representation; Interval-based event mining; Interval-based event classifier; Experiment; Conclusion
  • Slide 3
  • 3 Introduction. Classification predicts categorical class labels. It constructs a model from the training set and the values (class labels) of a classifying attribute, then uses that model to classify new data. It is a two-step process: model construction, then model usage.
  • Slide 4
  • 4 Introduction (cont.)
  • Slide 5
  • 5
  • Slide 6
  • 6 [Figure: example decision tree with test nodes "age?", "student?" and "credit rating?"; branch labels include "31..40", "fair" and "excellent"; leaves are "yes" / "no".]
  • Slide 7
  • 7 Preliminaries. An event is E = (type, start, end). An event list is EL = {E_1, E_2, ..., E_n}; its length |EL| is the number of events in the list. A composite event is E = (E_i R E_j), where R is a relation between E_i and E_j. The start time of E is min{E_i.start, E_j.start} and its end time is max{E_i.end, E_j.end}.
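The definitions on this slide can be sketched in Python as follows. This is only an illustration under stated assumptions: the class names `Event` and `Composite` and the use of plain strings for relation names are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Event:
    """A simple interval event E = (type, start, end)."""
    type: str
    start: float
    end: float


@dataclass(frozen=True)
class Composite:
    """A composite event E = (E_i R E_j); `relation` is a plain string (assumption)."""
    left: "Node"
    relation: str
    right: "Node"

    @property
    def start(self) -> float:
        # Start time is the earlier of the two component start times.
        return min(self.left.start, self.right.start)

    @property
    def end(self) -> float:
        # End time is the later of the two component end times.
        return max(self.left.end, self.right.end)


Node = Union[Event, Composite]

# Example: an event list of length 2 and one composite event built from it.
a = Event("A", 1, 5)
b = Event("B", 3, 8)
event_list = [a, b]          # |EL| = len(event_list)
ab = Composite(a, "overlap", b)
```

Because `Composite` exposes the same `start` and `end` attributes as `Event`, composites can be nested, for example `Composite(ab, "overlap", Event("C", 6, 9))`.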
  • Slide 8
  • 8 Augmented hierarchical representation. Temporal relations between two interval events: Before, Meet, Overlap, Start, Finish, Contain, Equal.
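As a rough illustration of the seven relations named on this slide, the sketch below classifies a pair of intervals. The slide does not give the endpoint conditions, so the conditions used here are an assumption based on the usual interval-algebra definitions, with the earlier-starting interval taken as the first operand and every interval assumed to satisfy start < end. The function name `relation` is hypothetical.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (start, end), assumed start < end


def relation(a: Interval, b: Interval) -> str:
    """Return the relation between two intervals (assumed definitions)."""
    # Put the earlier-starting interval first; ties go to the earlier end.
    (s1, e1), (s2, e2) = sorted([a, b])
    if e1 < s2:
        return "before"      # first ends strictly before second starts
    if e1 == s2:
        return "meet"        # first ends exactly where second starts
    if s1 == s2:
        # Same start: either identical, or the first is a prefix of the second.
        return "equal" if e1 == e2 else "start"
    if e1 == e2:
        return "finish"      # same end, first started earlier
    if e1 > e2:
        return "contain"     # second lies strictly inside first
    return "overlap"         # s1 < s2 < e1 < e2
```

For example, `relation((1, 4), (2, 6))` gives `"overlap"` and `relation((1, 6), (2, 4))` gives `"contain"`.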
  • Slide 9
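  • 9 Augmented hierarchical representation (cont.) The hierarchical representation ((A overlap B) overlap C) is augmented with a count vector on each relation: (A Overlap[0,0,0,1,0] B) Overlap[0,0,0,1,0] C. The five counts are, in order: C = contain count, F = finished-by count, M = meet count, O = overlap count, S = start count.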
  • Slide 10
  • 10 Augmented hierarchical representation (cont.)
  • Slide 11
  • 11 Augmented hierarchical representation (cont.) The linear ordering of the pattern's event endpoints is {{A+}{B+}{C+}{A-}{B-}{D+}{D-}{C-}}, where X+ denotes the start and X- the end of event X.
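A linear ordering like the one on this slide can be derived by sorting event endpoints by time. The sketch below assumes that X+ marks the start and X- the end of event X, and that endpoints sharing the same timestamp are grouped together; the function name `linear_ordering` and the example timestamps are made up for illustration.

```python
from itertools import groupby
from typing import Dict, List, Tuple


def linear_ordering(events: Dict[str, Tuple[float, float]]) -> List[List[str]]:
    """Map {label: (start, end)} to a time-ordered list of endpoint groups."""
    points = []
    for label, (start, end) in events.items():
        points.append((start, label + "+"))  # start endpoint
        points.append((end, label + "-"))    # end endpoint
    points.sort()
    # Endpoints with the same timestamp fall into the same group.
    return [
        [name for _, name in group]
        for _, group in groupby(points, key=lambda p: p[0])
    ]


# Hypothetical timestamps chosen to reproduce the ordering on the slide.
ordering = linear_ordering({"A": (1, 4), "B": (2, 5), "C": (3, 8), "D": (6, 7)})
```

With these timestamps, `ordering` is `[["A+"], ["B+"], ["C+"], ["A-"], ["B-"], ["D+"], ["D-"], ["C-"]]`.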
  • Slide 12
  • 12 Interval-based event mining. Candidate generation. Theorem: a (k+1)-pattern is a candidate pattern if it is generated from a frequent k-pattern and a 2-pattern, where the 2-pattern occurs in at least k - 1 frequent k-patterns. Dominant event: an event is the dominant event of a pattern P if it occurs in P and has the latest end time among all the events in P.
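The two definitions on this slide can be illustrated with a small sketch. It is a simplification, not the IEMiner algorithm itself: events are plain `(type, start, end)` tuples, and each frequent k-pattern is modeled merely as the set of 2-patterns it contains, which is an assumption made for this example. The function names are hypothetical.

```python
from collections import Counter
from typing import Hashable, Iterable, Sequence, Set, Tuple

Event = Tuple[str, float, float]  # (type, start, end)


def dominant_event(pattern_events: Sequence[Event]) -> Event:
    """Return the event with the latest end time among the pattern's events."""
    return max(pattern_events, key=lambda e: e[2])


def eligible_two_patterns(
    frequent_k_patterns: Iterable[Set[Hashable]], k: int
) -> Set[Hashable]:
    """Keep the 2-patterns that occur in at least k - 1 frequent k-patterns.

    Each k-pattern is given as the set of 2-patterns it contains (assumption).
    """
    counts = Counter()
    for pattern in frequent_k_patterns:
        counts.update(set(pattern))  # count each 2-pattern once per k-pattern
    return {two for two, n in counts.items() if n >= k - 1}


# Example with made-up data.
dom = dominant_event([("A", 1, 4), ("B", 2, 9), ("C", 3, 6)])
frequent_3 = [
    {"A overlap B", "B before C"},
    {"A overlap B", "A before C"},
    {"B before C", "C meet D"},
]
usable = eligible_two_patterns(frequent_3, k=3)
```

Here `dom` is the event of type B, and only the 2-patterns that appear in at least two of the three frequent 3-patterns remain in `usable`.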
  • Slide 13
  • 13 Interval-based event mining (cont.)
  • Slide 14
  • 14 Interval-based event mining (cont.) Support count
  • Slide 15
  • 15 IEClassifier. Class labels C_i, 1 ≤ i ≤ c, where c is the number of class labels. Information gain: p(TP) is the probability that pattern TP occurs in the dataset. Patterns whose information gain is below a predefined info_gain threshold are removed.
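The information-gain formula itself did not survive in the transcript. The sketch below therefore uses the standard definition, gain = H(C) - H(C | TP), where the condition is whether the pattern TP occurs in an instance; treating this as the slide's formula is an assumption. The function names and the example data are hypothetical.

```python
from collections import Counter
from math import log2
from typing import Dict, List, Sequence


def entropy(labels: Sequence[str]) -> float:
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(labels: Sequence[str], has_pattern: Sequence[bool]) -> float:
    """Standard information gain of the 'pattern occurs' test on the class label."""
    with_tp = [c for c, h in zip(labels, has_pattern) if h]
    without_tp = [c for c, h in zip(labels, has_pattern) if not h]
    p_tp = len(with_tp) / len(labels)  # p(TP): fraction of instances containing TP
    conditional = p_tp * entropy(with_tp) + (1 - p_tp) * entropy(without_tp)
    return entropy(labels) - conditional


def select_patterns(
    labels: Sequence[str],
    occurrences: Dict[str, Sequence[bool]],
    info_gain_threshold: float,
) -> List[str]:
    """Drop patterns whose information gain is below the threshold."""
    return [
        tp for tp, has in occurrences.items()
        if information_gain(labels, has) >= info_gain_threshold
    ]


# Made-up example: four instances, two classes, two candidate patterns.
labels = ["x", "x", "y", "y"]
occurrences = {
    "good": [True, True, False, False],   # perfectly separates the classes
    "bad": [True, False, True, False],    # carries no class information
}
kept = select_patterns(labels, occurrences, info_gain_threshold=0.5)
```

In this example only the pattern named "good" passes the threshold, because it splits the two classes cleanly while "bad" leaves the class distribution unchanged.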
  • Slide 16
  • 16 IEClassifier (cont.) Let PatternMatch_I be the set of discriminating patterns that are contained in the instance I.
  • Slide 17
  • 17 Experiment.
  • Slide 18
  • 18 Experiment (cont.) Classifiers shown: Nearest Neighbor (Neural Networks), Decision Tree, SVM (hyperplane).
  • Slide 19
  • 19 Conclusion. IEMiner algorithm and IEClassifier: the performance improved, and the best accuracy was achieved.