Significance: gives for the first time exact inference results in closed-form


Transcript of Significance: gives for the first time exact inference results in closed-form


• Significance: gives, for the first time, exact inference results in closed form

• Computational cost is cubic in the number of variables

• Derived the Stable-Jacobi approximate inference algorithm.
• Significance: when it converges, it converges to the exact result, while typically being more efficient.
• We analyze its convergence and give two sufficient conditions for convergence.

• Detection: given the channel transformation A, observation vector y, and the stable parameters of the noise z, compute the most probable transmission x

• Sample CDMA problem setup borrowed from [Yener-Tran-Comm.-2002]

• Exact inference: more accurate detection than methods designed for the AWGN (additive white Gaussian noise channel)

• Approximate inference: converges, as predicted, to the exact conditional posterior marginals

• First time exact inference in linear-stable model

• Faster, more accurate, lower memory consumption, and conveniently computed in closed form

• Future work:
• Investigate other families of distributions, like the Wishart and geometric stable distributions
• Other transforms

Closed under scalar multiplication

• We use the linear model Y = AX + Z
• X, Z are i.i.d. hidden variables drawn from a stable distribution; Y are the observations

• Inference is computed by marginalizing the posterior p(x|y)

• The problem: the stable distribution has no closed-form cdf or pdf (thus copulas and CFG cannot be used)

• Solution: perform inference in the characteristic function (Fourier) domain
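To make the Fourier-domain idea concrete, here is a minimal numerical sketch (an illustration, not the authors' code): the symmetric stable characteristic function is available in closed form even though the pdf is not, so the pdf can be recovered by numerically inverting the CF. The check uses α = 1 (Cauchy), one of the few stable laws whose pdf is also known in closed form.

```python
import numpy as np

# Symmetric alpha-stable CF with zero shift: phi(t) = exp(-(gamma*|t|)**alpha).
# The CF is available in closed form even though the pdf generally is not.
def stable_cf(t, alpha, gamma):
    return np.exp(-(gamma * np.abs(t)) ** alpha)

# Recover the pdf by numerically inverting the CF:
# p(x) = (1/(2*pi)) * integral of phi(t) * exp(-i*t*x) dt
def pdf_from_cf(x, alpha, gamma, t_max=200.0, n=400001):
    t = np.linspace(-t_max, t_max, n)
    dt = t[1] - t[0]
    # the imaginary part cancels for a symmetric distribution
    return np.sum(stable_cf(t, alpha, gamma) * np.cos(t * x)) * dt / (2 * np.pi)

# Sanity check at alpha = 1 (Cauchy), whose pdf IS known in closed form:
# p(x) = gamma / (pi * (x**2 + gamma**2))
x, gamma = 0.7, 1.0
approx = pdf_from_cf(x, alpha=1.0, gamma=gamma)
exact = gamma / (np.pi * (x ** 2 + gamma ** 2))
print(abs(approx - exact))  # small numerical error
```

The same inversion works for any α, e.g. α = 1.5, where no closed-form pdf exists.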

Inference with Heavy-Tails in Linear Models
Danny Bickson and Carlos Guestrin

• Network flows are linear
• The total flow at a node is composed of sums of distinct flows

• The challenge: how to model heavy-tailed network traffic?

Motivation: Large Scale Network modeling
• Huge amounts of data.
• Daily stats collected from the PlanetLab network using PlanetFlow:
• 662 PlanetLab nodes spread over the world
• 19,096,954,897 packets were transmitted
• 10,410,216,514,054 bytes were transmitted
• 24,012,123 unique IP addresses observed

Bandwidth distribution is heavy tailed: the top 1% of flows carry 19% of the total traffic

Bandwidth/port number distribution is heavy tailed

Heavy-tailed traffic distribution

• Use linear multivariate statistical methods for network modeling, monitoring, performance analysis and intrusion detection.

• Typically cannot be computed in closed form. Various approximations: mixtures of distributions [Chen-Infocom07], histograms [Lakhina-Sigcomm05], sketches [Li-IMC06], entropy [Lakhina-Sigcomm05], sampled moments [Nguyen-IMC07], etc.

Previous approaches for computing inference in heavy-tailed linear models

[Pipeline diagram labels: Input: prior marginal; Exact inference; Output: posterior marginal; NBP pipeline: Quantization → Fitting → Resampling → NBP output]

Main contribution
• First to compute exact inference in the linear-stable model, conveniently in closed form.
• Efficient iterative approximate inference.
• Our solution is:
• More efficient
• More accurate
• Requires less memory/storage

Stable distribution S(α, β, γ, δ): α is the characteristic exponent, β the skew, γ the scale, and δ the shift.

• A family of heavy-tailed distributions.
• Used in different problem domains: economics, physics, geology, etc.
• Example: the Cauchy, Gaussian, and Lévy distributions are stable.

Closed under addition
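Closure under addition can be verified directly in the characteristic-function domain. A small sketch (assuming the standard symmetric, β = 0, parametrization): the CF of a sum of independent stable variables is the product of their CFs, and that product is again a stable CF, with scale (γ₁^α + γ₂^α)^(1/α) and shift δ₁ + δ₂.

```python
import numpy as np

# CF of a symmetric (beta = 0) alpha-stable S(alpha, 0, gamma, delta):
# phi(t) = exp(1j*delta*t - (gamma*|t|)**alpha)
def stable_cf(t, alpha, gamma, delta):
    return np.exp(1j * delta * t - (gamma * np.abs(t)) ** alpha)

alpha = 1.5
g1, d1 = 1.0, 0.3    # scale/shift of X1
g2, d2 = 2.0, -1.0   # scale/shift of X2

t = np.linspace(-5, 5, 1001)
# CF of a sum of independent variables = product of the CFs ...
cf_sum = stable_cf(t, alpha, g1, d1) * stable_cf(t, alpha, g2, d2)
# ... which is again stable, with the parameters closure predicts:
g_sum = (g1 ** alpha + g2 ** alpha) ** (1.0 / alpha)
cf_pred = stable_cf(t, alpha, g_sum, d1 + d2)
print(np.max(np.abs(cf_sum - cf_pred)))  # ~0: an exact identity
```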

• Related work on linear models:
• Convolutional factor graphs (CFG) – [Mao-Tran-Info-Theory-03]. Assumes the pdf factorizes as a convolution of factors (shows this is possible for any linear model)
• Copula method – handles the linear model in the cdf domain
• Independent component analysis (ICA) – learns linear models and tries to reconstruct X. Can be used as a complementary method, since we assume that A is given.

Non-parametric BP (NBP) [Sudderth-CVPR03]

Linear characteristic graphical models (LCM)

• Given a linear model, we define LCM as the product of the joint characteristic functions for the probability distribution

• Motivation: LCM is the dual model to the convolution representation of the linear model

• Unlike CFG, LCM is always defined, for any distribution

• CFG shows that any linear model can be represented as a convolution

Linearity of stable distribution

Modeling network flows using stable distributions

• Our goal is to compute the posterior marginal p(x|y)

• Because stable distributions have no closed-form pdf, we have to compute marginalization in the Fourier domain.

• The dual operation to marginalization is slicing.

• The projection-slice theorem allows us to compute inference in the Fourier domain:

Inference in the Fourier domain
[Diagram: the difficult marginalization of the posterior (our goal) corresponds, through the 2D Fourier transform, to a slicing operation on the 2D characteristic function; an inverse Fourier transform of the slice then yields the posterior marginal]
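The projection-slice correspondence can be checked on the one stable family where everything is in closed form, the Gaussian (the α = 2 stable case). This sketch verifies that the CF of the marginal of x₁ equals the joint CF sliced at t₂ = 0.

```python
import numpy as np

# Joint CF of a zero-mean bivariate Gaussian (the alpha = 2 stable case):
# phi(t) = exp(-0.5 * t' @ Sigma @ t)
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])

def joint_cf(t1, t2):
    t = np.array([t1, t2])
    return np.exp(-0.5 * t @ Sigma @ t)

# CF of the marginal of x1, known in closed form for the Gaussian:
def marginal_cf(t1):
    return np.exp(-0.5 * Sigma[0, 0] * t1 ** 2)

# Projection-slice: marginalizing out x2 in the probability domain
# corresponds to slicing the joint CF at t2 = 0.
for t1 in (-1.0, 0.3, 2.5):
    assert abs(joint_cf(t1, 0.0) - marginal_cf(t1)) < 1e-12
print("slice of the joint CF == CF of the marginal")
```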

• LCM-Elimination: Exact inference algorithm for a general linear model

• Variable elimination algorithm in the Fourier domain

• Borrows ideas from belief propagation to compute approximate inference in the Fourier domain

• Uses distributivity of the slice and product operations

• Algorithm is exact on trees
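As a toy instance of eliminating a variable in the Fourier domain (a sketch, not LCM-Elimination itself), consider the scalar linear model y = a·x + z with independent Cauchy-distributed x and z: the CF of y is φ_x(a·t)·φ_z(t), which is again a Cauchy CF, exactly as stability predicts.

```python
import numpy as np

# Cauchy = S(1, 0, gamma, 0); its CF exp(-gamma*|t|) is in closed form.
def cauchy_cf(t, gamma):
    return np.exp(-gamma * np.abs(t))

a, gx, gz = 3.0, 1.0, 0.5
t = np.linspace(-4, 4, 801)

# Eliminating x in the Fourier domain: for y = a*x + z with independent x, z,
# the CF of y is phi_y(t) = phi_x(a*t) * phi_z(t).
cf_y = cauchy_cf(a * t, gx) * cauchy_cf(t, gz)

# Closure under scalar multiplication and addition predicts that y is
# again Cauchy, with scale |a|*gx + gz.
cf_pred = cauchy_cf(t, abs(a) * gx + gz)
print(np.max(np.abs(cf_y - cf_pred)))  # ~0: an exact identity
```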

Exact inference in LCM

Main result 1: exact inference in LCM with stable distributions

Main result 2: approximate inference in LCM with stable distributions

Approximate inference in LCM

Application: network monitoring

• We model PlanetLab network flows using an LCM with stable distributions.
• Extracted traffic flows from 25 Jan 2010: a total of 247,192,372 flows (the non-zero entries of the matrix A)
• Fitted flows for each node (vector b): a total of 16,741,746 unique nodes

• Computing the posterior marginal p(x|y)

• The cost of elimination is too high: O(n³) with n ≈ 16.7M variables

• Solution: use Stable-Jacobi with GraphLab!

Stable-Jacobi approximate inference algorithm
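Stable-Jacobi generalizes the classic Jacobi fixed-point iteration; as background, here is a minimal sketch of plain Jacobi for a linear system A·x = b (not the authors' Stable-Jacobi, which iterates on characteristic-function parameters). One sufficient condition for convergence is strict diagonal dominance of A.

```python
import numpy as np

# Classic Jacobi iteration for A @ x = b: split A = D + R with D = diag(A),
# then iterate x <- D^{-1} (b - R @ x). A sufficient condition for
# convergence is strict diagonal dominance of A.
def jacobi(A, b, iters=100):
    D = np.diag(A)       # diagonal entries of A
    R = A - np.diag(D)   # off-diagonal part
    x = np.zeros_like(b)
    for _ in range(iters):
        x = (b - R @ x) / D
    return x

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 5.0, 2.0],
              [0.0, 2.0, 6.0]])  # strictly diagonally dominant
b = np.array([1.0, 2.0, 3.0])
x = jacobi(A, b)
print(np.max(np.abs(A @ x - b)))  # residual shrinks toward 0
```

Each update touches only a variable and its neighbors, which is why the iteration parallelizes naturally on GraphLab.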

[Plots: speedup of Stable-Jacobi, and accuracy of the marginal characteristic function]

Acknowledgements
This research was supported by:
• ARO MURI W911NF0710287
• ARO MURI W911NF0810242
• NSF Mundo IIS-0803333
• NSF Nets-NBD CNS-0721591

Application: multiuser detection

Lower BER (bit error rate) is better

Conclusion

Number of packets is heavy tailed [Lakhina-Sigcomm05]


