Mining Event Periodicity from Incomplete Observations
Zhenhui (Jessie) Li*, Jingjing Wang, Jiawei Han
University of Illinois at Urbana-Champaign
*Now at Penn State University
KDD 2012, Beijing, China
Prologue: Detect Periodicity in Movements [Li et al., KDD’10]
Problem: What is the periodicity of the movement?
Bee example: 8 hours in the hive, 16 hours flying nearby
Prologue: Detect Periodicity in Movements [Li et al., KDD’10]
Observe the in-and-out movements from the reference spot (i.e., hive).
[Figure: two-dimensional movement converted to a one-dimensional binary sequence (in hive / outside hive) over time]
Easy to see the periodicity.
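The conversion from a two-dimensional track to a binary in/out sequence can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `to_binary_sequence`, the circular reference region with a fixed `radius`, and the sample coordinates are all assumptions made for the example.

```python
import math

def to_binary_sequence(points, ref, radius):
    """Return 1 for each point within `radius` of the reference spot, else 0."""
    rx, ry = ref
    return [1 if math.hypot(x - rx, y - ry) <= radius else 0 for x, y in points]

# Hypothetical track: one (x, y) position per time step.
track = [(0.1, 0.0), (0.2, 0.1), (3.0, 4.0), (5.0, 5.0), (0.0, 0.3), (6.0, 1.0)]
hive = (0.0, 0.0)  # assumed reference spot

sequence = to_binary_sequence(track, hive, radius=1.0)
print(sequence)  # [1, 1, 0, 0, 1, 0]
```

Once the movement is reduced to this sequence, periodicity detection no longer depends on the spatial coordinates.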
Challenge: Periodicity Detection for Incomplete Observations
• Two factors result in incomplete observations: inconsistent + low sampling rate
• Movement data collection in real scenarios:
– Human movement data collected from cellphones: locations are reported only when making calls
– Animal movement data: 2~3 locations in 3~5 days
2009-05-02 01:03 in
2009-05-03 11:30 out
2009-05-05 03:12 in
2009-05-09 12:03 in
2009-05-10 11:14 out
2009-05-11 02:15 in
…
[Figure: complete observations vs. incomplete observations (in hive / outside hive)]
A Challenging Case of Detecting Periodicity for Incomplete Observations
Sparse raw data:
2009-05-02 01:03 in
2009-05-03 11:30 out
2009-05-05 03:12 in
2009-05-09 12:03 in
2009-05-10 11:14 out
2009-05-11 02:15 in
…
[Figure: the sparse observations placed on a timeline: in, out, in]
Any periodicity in the above sequence?
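To work with such records, they first have to be placed on a discrete time axis. The sketch below parses the six records listed above into (hour index, label) pairs. The one-hour resolution, the choice of the first record's hour as index 0, and the names `RAW` and `parse_observations` are assumptions for illustration only.

```python
from datetime import datetime

# The six records shown on the slide (the elided "…" entries are not filled in).
RAW = """\
2009-05-02 01:03 in
2009-05-03 11:30 out
2009-05-05 03:12 in
2009-05-09 12:03 in
2009-05-10 11:14 out
2009-05-11 02:15 in
"""

def parse_observations(raw):
    """Map each record to (hours since the first record's hour, 1 for in / 0 for out)."""
    records = []
    for line in raw.strip().splitlines():
        date, clock, label = line.split()
        ts = datetime.strptime(f"{date} {clock}", "%Y-%m-%d %H:%M")
        records.append((ts, 1 if label == "in" else 0))
    # Index 0 is the start of the hour containing the first record.
    start = records[0][0].replace(minute=0, second=0, microsecond=0)
    return [(int((ts - start).total_seconds() // 3600), x) for ts, x in records]

observations = parse_observations(RAW)
print(observations)
# [(0, 1), (34, 0), (74, 1), (179, 1), (202, 0), (217, 1)]
```

Only 6 of the 218 hourly slots in this span carry an observation, which is what makes the periodicity hard to see.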
Mining Periodicity in Incomplete Data
• Event has a period of 20
• Occurrences of the event happen between 20k+5 and 20k+10
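A small sketch of this example: if the event occurs at timestamps between 20k+5 and 20k+10, then every positive observation, however sparse, falls into phases 5 through 10 once timestamps are taken modulo 20. The inclusive bounds, the sample size, and the random seed are assumptions made for the illustration.

```python
import random

T = 20  # period from the example

def event(t):
    """1 if the event occurs at timestamp t, assuming inclusive bounds 20k+5..20k+10."""
    return 1 if 5 <= t % T <= 10 else 0

# Hypothetical sparse sampling: 50 of 1000 timestamps are observed.
rng = random.Random(0)
observed = sorted(rng.sample(range(1000), 50))

# Fold the positive observations onto one period.
positive_phases = sorted({t % T for t in observed if event(t)})
print(positive_phases)
```

With the correct period, the positive phases stay inside the 5..10 window even though most timestamps were never observed.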
A Probabilistic Model for Periodic Event
Example:
• Human daily periodicity of visiting the office
• Period of 24 hours
• Visiting the office at 10:00-11:00 and 14:00-16:00
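One way to sketch this model in code is as a length-24 vector of per-hour probabilities, with the event at timestamp t drawn as a Bernoulli trial using the entry at t mod 24. The probability values (0.9 during office hours, 0.05 otherwise), the mapping of the two time ranges to hours 10, 14 and 15, and the function name `generate` are assumptions for illustration, not values from the talk.

```python
import random

T = 24
OFFICE_HOURS = {10, 14, 15}  # assumed hourly slots for 10:00-11:00 and 14:00-16:00

# Periodic distribution vector: probability of the event at each hour of the day.
p = [0.9 if h in OFFICE_HOURS else 0.05 for h in range(T)]

def generate(n, p, rng):
    """Draw x(t) for t = 0..n-1, each a Bernoulli trial with probability p[t mod T]."""
    period = len(p)
    return [1 if rng.random() < p[t % period] else 0 for t in range(n)]

rng = random.Random(42)
x = generate(24 * 30, p, rng)  # thirty simulated days, one value per hour
print(sum(x), "event hours out of", len(x))
```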
A Probabilistic Model for Periodic Event with Random Observation
[Figure: the model generates observations at randomly sampled timestamps, e.g., x(5)=1, x(62)=0]
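Random observation can be sketched as a second step applied to the complete sequence: each timestamp is kept independently with some probability and dropped otherwise. The independent-sampling assumption, the `rate` parameter, and the deterministic example sequence are choices made for this illustration.

```python
import random

def observe(x, rate, rng):
    """Keep each timestamp independently with probability `rate`."""
    return {t: v for t, v in enumerate(x) if rng.random() < rate}

T = 24
# Hypothetical complete sequence: the event occurs at hours 10, 14 and 15 every day.
full = [1 if t % T in (10, 14, 15) else 0 for t in range(24 * 10)]

rng = random.Random(7)
sparse = observe(full, rate=0.1, rng=rng)
print(len(sparse), "of", len(full), "timestamps observed")
```

The result is a mapping from observed timestamps to 0/1 values, in the same form as x(5)=1, x(62)=0; unobserved timestamps are simply absent rather than recorded as 0.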
Periodicity Detection by Overlaying Observations
True period: skewed distribution
Wrong period: even distribution
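The overlay idea can be sketched by counting positive observations per phase (t mod T) for a candidate period T. In this illustration the observations are sampled regularly (every 7th timestamp) rather than randomly so that the counts are reproducible; that sampling scheme, the event hours, and the name `overlay` are assumptions.

```python
from collections import Counter

def overlay(observations, T):
    """Count positive observations at each phase t mod T."""
    counts = Counter(t % T for t, x in observations if x == 1)
    return [counts.get(i, 0) for i in range(T)]

TRUE_T = 24
# Hypothetical sparse data: the event occurs at hours 10, 14, 15; one in seven hours observed.
observations = [(t, 1 if t % TRUE_T in (10, 14, 15) else 0)
                for t in range(0, 24 * 23 * 7, 7)]

print(overlay(observations, 24))  # all positives pile up at phases 10, 14 and 15
print(overlay(observations, 23))  # positives spread evenly: 3 at every phase
```

Under the true period the histogram is concentrated on three phases; under the wrong period the same 69 positives are spread uniformly.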
Relationship between Observation Ratio and Probabilistic Model
[Figure: positive/negative observation ratio and the periodic distribution vector]
Discrepancy Score to Measure Periodicity
If T (=24) is the correct period, the discrepancy score should be large for a certain set of timestamps.
If T (=23) is the wrong period, the discrepancy score is likely to be close to zero for any set of timestamps.
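The formula itself is not in this transcript, so the following is only a sketch under an assumed definition: for a candidate period T and a set of phases, the discrepancy is the share of positive observations falling into those phases minus the share of negative observations falling into them. The function name and the regularly sampled example data are likewise assumptions.

```python
def discrepancy(observations, T, phases):
    """Share of positives in `phases` minus share of negatives in `phases` (assumed definition)."""
    pos = [t for t, x in observations if x == 1]
    neg = [t for t, x in observations if x == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative observation")
    mu_pos = sum(1 for t in pos if t % T in phases) / len(pos)
    mu_neg = sum(1 for t in neg if t % T in phases) / len(neg)
    return mu_pos - mu_neg

# Hypothetical sparse data: event at hours 10, 14, 15 of a 24-hour day, every 7th hour observed.
observations = [(t, 1 if t % 24 in (10, 14, 15) else 0)
                for t in range(0, 24 * 23 * 7, 7)]

print(discrepancy(observations, 24, {10, 14, 15}))  # 1.0
print(discrepancy(observations, 23, {10, 14, 15}))  # 0.0
```

With T=24 the chosen phases contain all positives and no negatives, so the score reaches its maximum of 1. With T=23 positives and negatives are spread over the phases in the same proportions, so the two shares cancel.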
Periodicity Measure
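The measure shown on this slide is not in the transcript. A plausible sketch, stated here as an assumption rather than the authors' definition, scores each candidate period by the largest discrepancy achievable over all phase sets, which is obtained by keeping exactly the phases where the positive share exceeds the negative share, and then picks the best-scoring candidate. The names, the candidate range, and the example data are hypothetical.

```python
from collections import Counter

def periodicity_score(observations, T):
    """Largest positive-minus-negative share over all phase sets (assumed measure)."""
    pos = Counter(t % T for t, x in observations if x == 1)
    neg = Counter(t % T for t, x in observations if x == 0)
    n_pos, n_neg = sum(pos.values()), sum(neg.values())
    if n_pos == 0 or n_neg == 0:
        return 0.0
    # The maximizing set keeps every phase whose positive share exceeds its negative share.
    return sum(max(0.0, pos[i] / n_pos - neg[i] / n_neg) for i in range(T))

# Hypothetical sparse data: event at hours 10, 14, 15 of a 24-hour day, every 7th hour observed.
observations = [(t, 1 if t % 24 in (10, 14, 15) else 0)
                for t in range(0, 24 * 23 * 7, 7)]

scores = {T: periodicity_score(observations, T) for T in range(2, 25)}
best = max(scores, key=scores.get)
print(best)  # 24
```

Multiples of the true period (48, 72, ...) would also separate positives from negatives perfectly, so in this sketch the candidate range is capped at 24; handling such ties is left open.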
Performance Comparisons
[Figure axis: sampling rate (ratio of observed points in the complete sequence)]
Experiment on Real Human Data
One person’s visits to a specific location
Sampling rate: 20 min
Sampling rate: 1 hour
Problems with Using Fourier Transform to Detect Periodicity
[Figure panels: T=4 and T=16]
Summary: Mining Event Periodicity from Incomplete Observations
• Motivation
– Challenge of the real data: incomplete observations (inconsistent + low sampling rate)
• Method
– Overlay the segments and measure the “skewness” of the distribution
– Theoretically prove the correctness of the method
• Application
– Location prediction
– 2nd place in Nokia Mobile Data Challenge 2012
– Periodicity-based feature + SVM
Thanks! Questions?