Change Detection in Data Streams by Testing Exchangeability
Shen-Shyang Ho, JPL/Caltech
The research is part of the author’s PhD dissertation (in computer science) at George Mason University.
Conference travel is partially sponsored by a NASA Postdoctoral Program (NPP) Travel Grant.
04/21/23 1
Outline
1. Introduction
2. Previous Work (Statistics and Machine Learning/Data Mining/Computer Vision)
3. Intuition
4. Background (Exchangeability/Martingale)
5. Methodology
6. Comparison and Experimental Results
7. Application I: Adaptive Support Vector Machine (Classification Model)
8. Application II: Video Shot Change Detection (Cluster Model)
Introduction
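Let $X_1, X_2, \ldots, X_n$ be a sequence of independent p-dimensional random vectors with parameters $\theta_1, \theta_2, \ldots, \theta_n$. Test the following hypothesis:
$H_0: \theta_1 = \theta_2 = \cdots = \theta_n = \theta_0$
$H_1: \exists\, m,\ 1 \le m < n,$ with $\theta_1 = \cdots = \theta_m \ne \theta_{m+1} = \cdots = \theta_n$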
Assumption: Data vectors are observed sequentially.
Previous Work
Statistics:
Sequential analysis is statistical inference under the assumption that the number of observations/samples required is not predetermined.
• Sequential Probability Ratio Test – A. Wald (1945)
• Application: Quality Control (Military/Manufacturing)
• CUSUM (Cumulative Sum) – E. S. Page (1954)
• Refer to the journal “Sequential Analysis: Design Methods and Applications” for recent research.
• Most recent issue (vol 27, no 2, 2008): 3 out of 6 papers address change-point problems (structural change, minimax method for change-point detection, multidecision quickest change-point detection).
Machine Learning/Data Mining:
• Applications: Concept Drift Problem, Adaptive Classifier, Anomaly in Internet Traffic, Video-Shot Change Detection
• Proposed methodology is usually problem-specific
• Monitoring error, sliding window, weighted data, ensemble classifier …
• Statistical methods: Likelihood ratio method, Bayesian methods, Hypothesis Testing …
Related Data Mining/Machine Learning/Computer Vision Research
1. Xiuyao Song, Mingxi Wu, Christopher M. Jermaine, Sanjay Ranka: Statistical change detection for multi-dimensional data. KDD 2007: 667-676
2. Kolter, J.Z. and Maloof, M.A. Dynamic Weighted Majority: An ensemble method for drifting concepts. Journal of Machine Learning Research 8:2755--2790, 2007.
3. Klinkenberg, Ralf and Joachims, Thorsten: Detecting Concept Drift with Support Vector Machines. Proceedings of the Seventeenth International Conference on Machine Learning (ICML): 487--494, 2000.
4. Bi Song, Namrata Vaswani, Amit K. Roy Chowdhury: Closed-Loop Tracking and Change Detection in Multi-Activity Sequences. CVPR 2007
5. Paul L. Rosin: Thresholding for Change Detection. ICCV 1998: 274-279
6. Balachander Krishnamurthy, Subhabrata Sen, Yin Zhang, Yan Chen: Sketch-based change detection: methods, evaluation, and applications. Internet Measurement Conference 2003: 234-247
7. Tsuyoshi Idé, Keisuke Inoue: Knowledge Discovery from Heterogeneous Dynamic Systems using Change-Point Correlations. SDM 2005
8. Tsuyoshi Idé, Koji Tsuda: Change-Point Detection using Krylov Subspace Learning. SDM 2007
9. Daniel Kifer, Shai Ben-David, Johannes Gehrke, Detecting Changes in Data Streams, Proc. 30th VLDB Conference, 2004.
10. ... …
Motivation
“Lack of Exchangeability” implies “Change in Data Distribution/Model”
Intuition
1 2 3 4 5 6 7 8 9 10
1 9 3 5 2 6 7 2 8 10
Identically Distributed but may be Dependent
1 2 3 4 5 6 7 8 9 10
1 9 3 5 2 6 7 4 8 10
Background
Vovk et al.’s work on “Testing Exchangeability Online” (ICML 2003) and “Algorithmic Learning in a Random World” (Springer):
1. Testing the exchangeability assumption in an online mode.
2. Explicit martingale for testing the hypothesis of exchangeability.
(Refer to http://www.vovk.net (conformal prediction).)
Background
Let $\{Z_i : 1 \le i < \infty\}$ be a sequence of random variables. A finite sequence of random variables $Z_1, \ldots, Z_n$ is exchangeable if the joint distribution $p(Z_1, \ldots, Z_n)$ is invariant under any permutation of the indices of the random variables.
A martingale is a sequence of random variables $\{M_i : 0 \le i < \infty\}$ such that $M_n$ is a measurable function of $Z_1, \ldots, Z_n$ for all $n = 0, 1, \ldots$ (in particular, $M_0$ is a constant value) and the conditional expectation of $M_{n+1}$ given $Z_1, \ldots, Z_n$ is equal to $M_n$, i.e.,
$E(M_{n+1} \mid Z_1, \ldots, Z_n) = M_n.$
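A small sketch of why exchangeable data can be "identically distributed but dependent": drawing without replacement from an urn. The urn contents and the helper name `draw_probability` are illustrative assumptions, not part of the talk.

```python
from fractions import Fraction
from itertools import permutations

def draw_probability(sequence, urn):
    """Exact probability of drawing `sequence` in order, without replacement."""
    remaining = list(urn)
    prob = Fraction(1)
    for ball in sequence:
        prob *= Fraction(remaining.count(ball), len(remaining))
        if prob == 0:
            return prob  # the sequence cannot be drawn from this urn
        remaining.remove(ball)
    return prob

urn = ["R", "R", "B"]  # hypothetical urn: two red balls, one blue ball
orderings = sorted(set(permutations(urn)))
probs = {seq: draw_probability(seq, urn) for seq in orderings}

# Exchangeable: every ordering of the same outcomes has the same probability.
for seq, p in probs.items():
    print("".join(seq), p)

# Dependent: knowing the first draw changes the odds for the second draw.
p_second_red = draw_probability(("R", "R"), urn) + draw_probability(("B", "R"), urn)
p_second_red_given_first_red = (
    draw_probability(("R", "R"), urn) / draw_probability(("R",), urn)
)
print(p_second_red, p_second_red_given_first_red)
```

All three orderings have the same joint probability, so the draws are exchangeable, yet the second draw is not independent of the first.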
Background
Doob’s Maximal Inequality
Suppose that $\{M_i : 0 \le i < \infty\}$ is a nonnegative martingale. Then for any $\lambda > 0$ and $0 \le n < \infty$,
$\lambda \, P\left(\max_{k \le n} M_k \ge \lambda\right) \le E(M_n).$
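One consequence worth writing out, under the assumption (not stated on the slide) that the martingale starts at the constant $M_0 = 1$: a martingale has constant expectation, so $E(M_n) = M_0$, and Doob's maximal inequality with threshold $\lambda$ becomes a bound on how often the martingale can ever reach that threshold.

```latex
P\left(\max_{k \le n} M_k \ge \lambda\right)
  \le \frac{E(M_n)}{\lambda}
  = \frac{M_0}{\lambda}
  = \frac{1}{\lambda}
```

For example, with $\lambda = 20$ the probability that the martingale reaches the threshold while the data are exchangeable is at most 0.05.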
Methodology - Strangeness
Strangeness measures how well one data point (for each data point seen so far) is represented by a data model compared to the other points.
• Applicable to classification, regression, or cluster models
• Measures diversity/disagreement, i.e., the higher the strangeness of a point, the less likely it is to come from the model
Condition for a valid strangeness measure: A strangeness value of a data point at a particular time instance should be independent of the order it is observed with respect to the other data points.
Classification Model
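[Figure: data stream over time t: aaaaa…aaaaa (segment A, t = 1 to 1000), bbbbb…bbbbb (segment B, t = 1001 to 2000), ccccc…ccccc (segment C, t = 2001 to 3000)]
Strangeness (K-NN):
$\alpha_i = \frac{\sum_{j=1}^{k} d_{ij}^{y}}{\sum_{j=1}^{k} d_{ij}^{-y}}$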
Strangeness (SVM): Lagrange Multiplier
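A minimal sketch of the K-NN strangeness, assuming the usual conformal-prediction reading: the sum of the k smallest distances from point i to points with the same label, divided by the sum of the k smallest distances to points with a different label. Euclidean distance, the function name `knn_strangeness`, and the toy data are assumptions for illustration.

```python
import math

def knn_strangeness(points, labels, i, k=3):
    """Ratio of summed k-nearest same-class to k-nearest other-class distances."""
    same, other = [], []
    for j, (p, lab) in enumerate(zip(points, labels)):
        if j == i:
            continue  # a point is not its own neighbour
        d = math.dist(points[i], p)  # Euclidean distance (an assumption)
        (same if lab == labels[i] else other).append(d)
    return sum(sorted(same)[:k]) / sum(sorted(other)[:k])

# Hypothetical 1-D data: class 0 near the origin, class 1 near 10,
# plus one class-0 point (index 6) sitting inside the class-1 region.
points = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,), (9.0,)]
labels = [0, 0, 0, 1, 1, 1, 0]

print(knn_strangeness(points, labels, 0, k=2))  # typical point: small value
print(knn_strangeness(points, labels, 6, k=2))  # misplaced point: large value
```

The value depends only on the set of points and labels, not on the order in which they arrived, which is what the validity condition for a strangeness measure asks for.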
Cluster Model
Strangeness of a data vector $z_i$ in a cluster:
$\alpha_i = \| z_i - C \|$
where $C$ is the center of the cluster.
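The cluster strangeness can be sketched in a few lines. The slide does not say how the center is obtained; taking it as the mean of the cluster members is an assumption here, as is the function name `cluster_strangeness`.

```python
import math

def cluster_strangeness(z, members):
    """Euclidean distance from z to the cluster center (mean of `members`)."""
    dim = len(z)
    center = tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
    return math.dist(z, center)

# Hypothetical cluster: four corners of a square centered at (1, 1).
members = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]

print(cluster_strangeness((1.0, 1.0), members))  # at the center
print(cluster_strangeness((4.0, 5.0), members))  # far from the cluster
```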
Regression Model
$\alpha_i = \alpha(x_i, y_i) = \frac{|y_i - f(x_i)|}{\exp(g(x_i))}$
where $f$ is the regression function and $g$ is the error estimation function for $f$ at $x_i$.
(Papadopoulos et al., Inductive Confidence Machines for Regression, ECML, LNAI 2430, pp 345-356, 2002)
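A sketch of the regression strangeness as the absolute residual scaled by the exponential of an error estimate. The callables `f` and `g` below are hypothetical stand-ins chosen only to make the arithmetic easy to follow; in practice both would be fitted models.

```python
import math

def regression_strangeness(x, y, f, g):
    """|y - f(x)| / exp(g(x)): residual scaled by the estimated error at x."""
    return abs(y - f(x)) / math.exp(g(x))

def f(x):
    return 2.0 * x + 1.0  # hypothetical regression function

def g_flat(x):
    return 0.0  # hypothetical error estimate: exp(0) = 1, no rescaling

def g_noisy(x):
    return math.log(2.0)  # hypothetical: errors expected to be twice as large

print(regression_strangeness(3.0, 7.5, f, g_flat))
print(regression_strangeness(3.0, 7.5, f, g_noisy))
```

The same residual counts as less strange where the error estimate says larger errors are expected.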
Methodology
p-value of a new point $x_{n+1}$ given previously seen data points:
$PV(\{x_1, x_2, \ldots, x_{n+1}\}, \theta_{n+1}) = \frac{\#\{i : \alpha_i > \alpha_{n+1}\} + \theta_{n+1}\, \#\{i : \alpha_i = \alpha_{n+1}\}}{n+1}$
where $\alpha_i$ is the strangeness measure for $x_i$, $i = 1, 2, \ldots, n+1$, and $\theta_{n+1}$ is randomly chosen from [0,1] for each new point $x_{n+1}$.
$\theta_{n+1}$: necessary so that the sequence of p-values is uniformly distributed in [0,1] for any strangeness measure (Vovk, 2003)
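A minimal sketch of the randomized p-value computation: count the points that are stranger than the new one, add a random fraction of the ties (the new point ties with itself), and divide by the number of points. The function name `p_value` and the option to pass `theta` explicitly are assumptions made so the example is reproducible.

```python
import random

def p_value(alphas, theta=None, rng=random):
    """Randomized p-value of the newest point.

    alphas: strangeness values of all points seen so far; the last entry
            belongs to the new point.
    theta:  tie-breaking number in [0, 1]; drawn at random when omitted.
    """
    if theta is None:
        theta = rng.random()
    new = alphas[-1]
    greater = sum(1 for a in alphas if a > new)
    equal = sum(1 for a in alphas if a == new)  # includes the new point itself
    return (greater + theta * equal) / len(alphas)

# One stranger point (0.9) and two ties (the two 0.5 values).
print(p_value([0.2, 0.9, 0.5, 0.5], theta=0.5))
# The new point is the strangest seen so far, so its p-value is small.
print(p_value([0.1, 0.2, 0.3, 5.0], theta=0.5))
```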
Methodology
Consider the null hypothesis
$H_0$: no change in data stream
against the alternative hypothesis
$H_1$: a change occurs in data stream.
The test for change continues as long as
$0 < M_i < \lambda.$
One rejects the null hypothesis when
$M_i \ge \lambda.$
Methodology
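$M_n^{(\epsilon)} = \prod_{i=1}^{n} \left( \epsilon \, p_i^{\epsilon - 1} \right)$
where $\epsilon$ is a fixed positive number in $(0,1)$ and $p_i$, $i = 1, \ldots, n$, are the p-values at time $n$.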
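A sketch of the martingale as a running product over the p-values, together with the threshold test. The starting value of 1, the default values `eps=0.92` and `lam=20.0`, and the function names are illustrative assumptions; p-values are assumed to be strictly positive.

```python
def power_martingale(p_values, eps=0.92):
    """Running values M_1..M_n of the product of eps * p_i**(eps - 1).

    Starts from M_0 = 1 (an assumption); each p_i must be > 0.
    """
    m, path = 1.0, []
    for p in p_values:
        m *= eps * p ** (eps - 1.0)
        path.append(m)
    return path

def first_alarm(p_values, eps=0.92, lam=20.0):
    """1-based index n of the first M_n >= lam, or None if no alarm is raised."""
    for n, m in enumerate(power_martingale(p_values, eps), start=1):
        if m >= lam:
            return n
    return None

# p-values of 1.0 (nothing strange) shrink the martingale by eps each step.
print(power_martingale([1.0, 1.0], eps=0.5))
# A run of very small p-values (very strange points) raises an alarm quickly.
print(first_alarm([0.01] * 5, eps=0.5, lam=20.0))
# Unremarkable p-values never reach the threshold.
print(first_alarm([0.9] * 10, eps=0.5, lam=20.0))
```

Small p-values multiply the martingale by a factor greater than 1 and large p-values by a factor less than 1, so only a sustained run of strange points drives it across the threshold.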
Experimental Result – Performance Measure
Experimental Result – Varying
Experimental Result – Varying Strangeness
Experimental Result – Varying
[Figure panels: Linearly Separable Classification Model; Linearly Non-separable Classification Model]
Experimental Result
Ringnorm/Twonorm (Change in dataset every 1000 points)
Nursery Categorical Dataset (Change in class compositions every 1000 points)
Experimental Result – Different Methods
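Changes at time … from Y = … (a function of x plus a noise term n) to Y = …, where the noise terms n are Gaussian noise.
[Equation: coefficients and change time are not recoverable from the transcript]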
Application: Adaptive SVM
Simulated USPS 3-Digit Image Data Stream
t: 01120120… 0340033404… 156556115… 77789987…
Application: Adaptive SVM
A (blue): True change point known to the SVM
B (red): Adaptive SVM using martingale method
C (magenta): SVM using sliding window of size 250
D (black): SVM using sliding window of size 500
E (green): SVM using sliding window of size 1000
Application: Video-Shot Change Detection
Martingale Change Detection using multiple features (MVMT: Multiple-view martingale test)
Application: Video-Shot Change Detection
• HI: Histogram Intersection
• Chi-Square Measure
• Euclidean Distance (ED)
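The three measures named above can be sketched for a pair of normalized histograms. The slide gives no formulas, so these are common textbook definitions and should be read as assumptions; in particular, the chi-square measure has several variants, and the one below divides by the sum of the two bins.

```python
import math

def histogram_intersection(h1, h2):
    """Sum of bin-wise minima; equals 1.0 for identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def chi_square(h1, h2):
    """Sum of (a - b)^2 / (a + b) over bins, skipping bins empty in both."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)

def euclidean_distance(h1, h2):
    """Square root of the summed squared bin differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))

# Hypothetical 3-bin color histograms from two consecutive frames.
frame_a = [0.5, 0.5, 0.0]
frame_b = [0.25, 0.25, 0.5]

print(histogram_intersection(frame_a, frame_b))
print(chi_square(frame_a, frame_b))
print(euclidean_distance(frame_a, frame_b))
```

Histogram intersection is a similarity (larger means more alike), while the other two are distances (larger means less alike).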
Reference
1. S.-S. Ho and H. Wechsler, Detecting Change-Points in Unlabeled Data Streams using Martingale, Proc. 20th Int. Joint Conf. on Artificial Intelligence (IJCAI 2007), Hyderabad, India, Jan. 6 - 12, 2007.
2. S.-S. Ho, A Martingale Framework for Concept Change Detection in Time-Varying Data Streams, Proc. Int. Conf. on Machine Learning (ICML 2005), Bonn, Germany, Aug. 7 - 11, 2005.
3. S.-S. Ho and H. Wechsler, Adaptive Support Vector Machine for Time-Varying Data Streams Using the Martingale, Proc. Int. Joint Conf. on Artificial Intelligence (IJCAI 2005), Edinburgh, Scotland, July 30 - Aug. 5, 2005.
4. S.-S. Ho and H. Wechsler, On the Detection of Concept Change in Time-Varying Data Streams by Testing Exchangeability, Proc. Conf. on Uncertainty in Artificial Intelligence (UAI 2005), Edinburgh, Scotland, July 26 - 29, 2005.
5. http://shenshyang.googlepages.com/codes (matlab codes + datasets)
Acknowledgement
• Harry Wechsler, PhD Advisor (George Mason University)
• Volodya Vovk (Royal Holloway, University of London)
• Alexander Gammerman (Royal Holloway, University of London)
• Oak Ridge Associated Universities (ORAU)