2013-SIGMETRICS-heavytails
-
Upload
farrukh-javed -
Category
Documents
-
view
6 -
download
0
description
Transcript of 2013-SIGMETRICS-heavytails
![Page 1: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/1.jpg)
The Fundamentals of Heavy TailsProperties, Emergence, & Identification
Jayakrishnan Nair, Adam Wierman, Bert Zwart
![Page 2: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/2.jpg)
Why am I doing a tutorial on heavy tails?
Why are we writing a book on the topic?
Because heavy-tailed phenomena are everywhere!
Because we’re writing a book on the topic…
![Page 3: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/3.jpg)
Why am I doing a tutorial on heavy tails?
Because we’re writing a book on the topic…
Why are we writing a book on the topic?
Because heavy-tailed phenomena are everywhere! BUT, they are extremely misunderstood.
![Page 4: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/4.jpg)
Heavy-tailed phenomena are treated as something
Mysterious, Surprising, & Controversial
![Page 5: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/5.jpg)
Heavy-tailed phenomena are treated as something
Mysterious, Surprising, & Controversial
Our intuition is flawed because intro probability classes focus on light-tailed distributions
Simple, appealing statistical approaches have BIG problems
![Page 6: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/6.jpg)
Heavy-tailed phenomena are treated as something
Mysterious, Surprising, & Controversial
BUT…
2005, ToN
Similar stories in electricity nets, citation nets, …
![Page 7: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/7.jpg)
Heavy-tailed phenomena are treated as something
Mysterious, Surprising, & Controversial
1. Properties 2. Emergence
3. Identification
![Page 8: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/8.jpg)
What is a heavy-tailed distribution? A distribution with a “tail” that is “heavier” than an Exponential
![Page 9: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/9.jpg)
What is a heavy-tailed distribution? A distribution with a “tail” that is “heavier” than an Exponential
Exponential: Pr(>)
![Page 10: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/10.jpg)
What is a heavy-tailed distribution?
Canonical Example: The Pareto Distribution a.k.a. the “power-law” distribution
A distribution with a “tail” that is “heavier” than an Exponential
Pr > = ( ) =density: ( ) = for ≥
Exponential: Pr(>)
![Page 11: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/11.jpg)
What is a heavy-tailed distribution? A distribution with a “tail” that is “heavier” than an Exponential
Many other examples: LogNormal, Weibull, Zipf, Cauchy, Student’s t, Frechet, …
: log ∼ = /Canonical Example: The Pareto Distribution a.k.a. the “power-law” distribution
Exponential: Pr(>)
![Page 12: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/12.jpg)
What is a heavy-tailed distribution? A distribution with a “tail” that is “heavier” than an Exponential
Many other examples: LogNormal, Weibull, Zipf, Cauchy, Student’s t, Frechet, …Canonical Example: The Pareto Distribution a.k.a. the “power-law” distribution
Many subclasses: Regularly varying, Subexponential, Long-tailed, Fat-tailed, …
Exponential: Pr(>)
![Page 13: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/13.jpg)
Heavy-tailed distributions have many strange & beautiful properties
• The “Pareto principle”: 80% of the wealth owned by 20% of the population, etc.• Infinite variance or even infinite mean• Events that are much larger than the mean happen “frequently”….
These are driven by 3 “defining” properties1) Scale invariance2) The “catastrophe principle”3) The residual life ”blows up”
![Page 14: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/14.jpg)
Scale invariance
![Page 15: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/15.jpg)
Scale invarianceis scale invariant if there exists an and a such that = ( ) for all , such that ≥ .
“change of scale”
![Page 16: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/16.jpg)
Scale invarianceis scale invariant if there exists an and a such that = ( ) for all , such that ≥ .
Example: Pareto distributions= = 1Theorem: A distribution is scale invariant if and only if it is Pareto.
![Page 17: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/17.jpg)
Scale invarianceis scale invariant if there exists an and a such that = ( ) for all , such that ≥ .
is asymptotically scale invariant if there exists a continuous, finite such that lim→ ( ) = for all .
Asymptotic scale invariance
![Page 18: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/18.jpg)
Example: Regularly varying distributionsis regularly varying if = ( ), where ( ) is slowly varying,
i.e., lim→ ( )( ) = 1 for all > 0.
Theorem: A distribution is asymptotically scale invariant iff it is regularly varying.
Asymptotic scale invarianceis asymptotically scale invariant if there exists a continuous, finite such that lim→ ( ) = for all .
![Page 19: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/19.jpg)
Example: Regularly varying distributionsis regularly varying if = ( ), where ( ) is slowly varying,
i.e., lim→ ( )( ) = 1 for all > 0.
Regularly varying distributions are extremely useful. They basically behave like Pareto distributions with respect to the tail: “Karamata” theorems “Tauberian” theorems
![Page 20: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/20.jpg)
Heavy-tailed distributions have many strange & beautiful properties
• The “Pareto principle”: 80% of the wealth owned by 20% of the population, etc.• Infinite variance or even infinite mean• Events that are much larger than the mean happen “frequently”….
These are driven by 3 “defining” properties1) Scale invariance2) The “catastrophe principle”3) The residual life ”blows up”
![Page 21: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/21.jpg)
A thought experimentDuring lecture I polled my 50 students about their heights and the number of twitter followers they have…
The sum of the heights was ~300 feet.The sum of the number of twitter followers was 1,025,000
What led to these large values?
![Page 22: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/22.jpg)
A thought experimentDuring lecture I polled my 50 students about their heights and the number of twitter followers they have…
The sum of the heights was ~300 feet.The sum of the number of twitter followers was 1,025,000
A bunch of people were probably just over 6’ tall(Maybe the basketball teams were in the class.)
One person was probably a twitter celebrity and had ~1 million followers.
“Catastrophe principle”
“Conspiracy principle”
![Page 23: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/23.jpg)
ExampleConsider + i.i.d Weibull.Given + = , what is the marginal density of ?
Light-tailed Weibull
Heavy-tailed WeibullExponential
()
“Catastrophe principle”
“Conspiracy principle”
0
![Page 24: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/24.jpg)
![Page 25: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/25.jpg)
Extremely useful for random walks, queues, etc.
“Principle of a single big jump”
![Page 26: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/26.jpg)
Subexponential distributionsis subexponential if for i.i.d. , Pr +⋯+ > ∼ ( > )
![Page 27: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/27.jpg)
Subexponential distributionsis subexponential if for i.i.d. , Pr +⋯+ > ∼ ( > )
Pareto
Subexponential
Weibull
LogNormal
Regularly Varying
Heavy-tailed
![Page 28: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/28.jpg)
Heavy-tailed distributions have many strange & beautiful properties
• The “Pareto principle”: 80% of the wealth owned by 20% of the population, etc.• Infinite variance or even infinite mean• Events that are much larger than the mean happen “frequently”….
These are driven by 3 “defining” properties1) Scale invariance2) The “catastrophe principle”3) The residual life ”blows up”
![Page 29: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/29.jpg)
A thought experimentWhat happens to the expected remaining waiting time as we wait
…for a table at a restaurant?…for a bus?…for the response to an email?
residual life
If you don’t get it quickly, you never will…
The remaining wait drops as you wait
![Page 30: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/30.jpg)
The distribution of residual lifeThe distribution of remaining waiting time given you have
already waited time is = ( )( ) .
Examples:
Exponential : = ( ) = “memoryless”
Pareto : = = 1 + Increasing in x
![Page 31: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/31.jpg)
The distribution of residual lifeThe distribution of remaining waiting time given you have
already waited time is = ( )( ) .
Mean residual life
Hazard rate
= − > = ∫= = (0)
![Page 32: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/32.jpg)
BUT: not all heavy-tailed distributions have DHR / IMRLsome light-tailed distributions are DHR / IMRL
What happens to the expected remaining waiting time as we wait…for a table at a restaurant?…for a bus?…for the response to an email?
![Page 33: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/33.jpg)
BUT: not all heavy-tailed distributions have DHR / IMRLsome light-tailed distributions are DHR / IMRL
Long-tailed distributionsis long-tailed if lim→ = lim→ ( )( ) = 1 for all
![Page 34: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/34.jpg)
Long-tailed distributions
Pareto
Subexponential
Weibull
LogNormal
Regularly Varying
Heavy-tailed
Long-tailed distributions
is long-tailed if lim→ = lim→ ( )( ) = 1 for all
![Page 35: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/35.jpg)
Pareto
Subexponential
Weibull
LogNormal
Regularly Varying
Heavy-tailed
Long-tailed distributions
Difficult to work with in general
![Page 36: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/36.jpg)
Pareto
Subexponential
Weibull
LogNormal
Regularly Varying
Heavy-tailed
Long-tailed distributions
Asymptotically scale invariant Useful analytic properties
Difficult to work with in general
![Page 37: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/37.jpg)
Pareto
Subexponential
Weibull
LogNormal
Regularly Varying
Heavy-tailed
Long-tailed distributions
Catastrophe principle Useful for studying random walks
Difficult to work with in general
![Page 38: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/38.jpg)
Pareto
Subexponential
Weibull
LogNormal
Regularly Varying
Heavy-tailed
Long-tailed distributions
Residual life “blows up” Useful for studying extremes
Difficult to work with in general
![Page 39: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/39.jpg)
Heavy-tailed phenomena are treated as something
Mysterious, Surprising, & Controversial
1. Properties 2. Emergence
3. Identification
![Page 40: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/40.jpg)
We’ve all been taught that the Normal is “normal”…because of the Central Limit Theorem
But the Central Limit Theorem we’re taught is not complete!
![Page 41: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/41.jpg)
Law of Large Numbers (LLN): ∑ → . . when < ∞∑ = + ( )A quick reviewConsider i.i.d. . How does ∑ grow?
= 15 = 300
∑
![Page 42: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/42.jpg)
A quick reviewConsider i.i.d. . How does ∑ grow?
Central Limit Theorem (CLT):when Var = < ∞.1 − → ~ (0, )∑ = + + ( )
∑∑−[
]
![Page 43: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/43.jpg)
A quick reviewConsider i.i.d. . How does ∑ grow?
Central Limit Theorem (CLT):when Var = < ∞.1 − → ~ (0, )∑ = + + ( )
Two key assumptions
∑∑−[
]
![Page 44: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/44.jpg)
A quick reviewConsider i.i.d. . How does ∑ grow?
Central Limit Theorem (CLT):when Var = < ∞.1 − → ~ (0, )∑ = + + ( )
What if
∑∑−[
]
![Page 45: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/45.jpg)
A quick reviewConsider i.i.d. . How does ∑ grow?
Central Limit Theorem (CLT):when Var = < ∞.1 − → ~ (0, )∑ = + + ( )= 300
What if
∑−[
]
∑
![Page 46: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/46.jpg)
A quick reviewConsider i.i.d. . How does ∑ grow?
Central Limit Theorem (CLT):when Var = < ∞.1 − → ~ (0, )
What if
The Generalized Central Limit Theorem (GCLT):1 − → (0, ) ∈ (0,2)∑ = + / + ( / )
![Page 47: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/47.jpg)
A quick reviewConsider i.i.d. . How does ∑ grow?
Central Limit Theorem (CLT):when Var = < ∞.1 − → ~ (0, )
What if
The Generalized Central Limit Theorem (GCLT):1 − → (0, ) ∈ (0,2)
![Page 48: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/48.jpg)
Consider i.i.d. . How does ∑ grow?
Returning to our original question…
Either the Normal distribution ORa power-law distribution can emerge!
![Page 49: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/49.jpg)
Consider i.i.d. . How does ∑ grow?
Returning to our original question…
Either the Normal distribution ORa power-law distribution can emerge!
…but this isn’t the only question one can ask about ∑ .
What is the distribution of the“ruin” time?
The ruin time is always heavy-tailed!
![Page 50: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/50.jpg)
What is the distribution of the“ruin” time?
The ruin time is always heavy-tailed!
Consider a symmetric 1-D random walk
1/2
1/2
The distribution of ruin time satisfies Pr > ∼ /
![Page 51: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/51.jpg)
We’ve all been taught that the Normal is “normal”…because of the Central Limit Theorem
Heavy-tails are more “normal” than the Normal!1. Additive Processes2. Multiplicative Processes3. Extremal Processes
![Page 52: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/52.jpg)
A simple multiplicative process= ⋅ ⋅ … ⋅ ,where are i.i.d. and positive
Ex: incomes, populations, fragmentation, twitter popularity…
“Rich get richer”
![Page 53: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/53.jpg)
Multiplicative processes almost always lead to heavy tails
An example:, ∼ ( )Pr ⋅ > ≥ Pr >=⇒ ⋅ is heavy-tailed!
![Page 54: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/54.jpg)
= ⋅ ⋅ … ⋅Multiplicative processes almost always lead to heavy tails
log = log + log +⋯+ logCentral Limit Theoremlog = + + ,where ∼ (0, )
when Var = < ∞.⋅ ⋅ … ⋅ / → ∼ (0, )where = [ ]and Var log = < ∞.
![Page 55: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/55.jpg)
⋅ ⋅ … ⋅ / → ∼ (0, )where = [ ]and Var log = < ∞.
Multiplicative central limit theorem /
![Page 56: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/56.jpg)
⋅ ⋅ … ⋅ / → ∼ (0, )where = [ ]and Var log = < ∞.
Satisfied by all distributions with finite mean and many with infinite mean.
Multiplicative central limit theorem
![Page 57: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/57.jpg)
“Rich get richer”
LogNormals emergeHeavy-tails
A simple multiplicative process
Ex: incomes, populations, fragmentation, twitter popularity…
= ⋅ ⋅ … ⋅ ,where are i.i.d. and positive
![Page 58: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/58.jpg)
A simple multiplicative process
Ex: incomes, populations, fragmentation, twitter popularity…
Multiplicative process with a lower barrier
Multiplicative process with noise= += min( , ) Distributions that are
approximately power-law emerge
= ⋅ ⋅ … ⋅ ,where are i.i.d. and positive
![Page 59: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/59.jpg)
A simple multiplicative process
Ex: incomes, populations, fragmentation, twitter popularity…
Multiplicative process with a lower barrier= min( , )
= ⋅ ⋅ … ⋅ ,where are i.i.d. and positive
Under minor technical conditions, → such thatlim→ ( ) = ∗ where ∗ = sup( ≥ 0| ≤ 1)“Nearly” regularly varying
![Page 60: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/60.jpg)
We’ve all been taught that the Normal is “normal”…because of the Central Limit Theorem
Heavy-tails are more “normal” than the Normal!1. Additive Processes2. Multiplicative Processes3. Extremal Processes
![Page 61: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/61.jpg)
A simple extremal process= max( , , … , )Ex: engineering for floods, earthquakes, etc. Progression of world records
“Extreme value theory”
![Page 62: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/62.jpg)
A simple example∼ ( )Pr max ,… , > + = +How does scale?
= 1 − e
= max( , , … , ) −
→ := 1 − e= 1, = logGumbel distribution
![Page 63: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/63.jpg)
How does scale? = max( , , … , ) −
“Extremal Central Limit Theorem”− → Heavy-tailedHeavy or light-tailedLight-tailed
![Page 64: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/64.jpg)
How does scale? = max( , , … , ) −
“Extremal Central Limit Theorem”iff are regularly varyinge.g. when are Uniforme.g. when are LogNormal
− →
![Page 65: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/65.jpg)
A simple extremal process= max( , , … , )Ex: engineering for floods, earthquakes, etc. Progression of world records
Either heavy-tailed or light-tailed distributions can emerge as → ∞
…but this isn’t the only question one can ask about .
What is the distribution of thetime until a new “record” is set?
The time until a record is always heavy-tailed!
![Page 66: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/66.jpg)
The time until a record is always heavy-tailed!
:Time between & + 1 recordPr > ∼ 2What is the distribution of thetime until a new “record” is set?
![Page 67: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/67.jpg)
We’ve all been taught that the Normal is “normal”…because of the Central Limit Theorem
Heavy-tails are more “normal” than the Normal!1. Additive Processes2. Multiplicative Processes3. Extremal Processes
![Page 68: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/68.jpg)
Heavy-tailed phenomena are treated as something
Mysterious, Surprising, & Controversial
1. Properties 2. Emergence
3. Identification
![Page 69: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/69.jpg)
Heavy-tailed phenomena are treated as something
Mysterious, Surprising, & Controversial
1999 Sigcomm paper – 4500+ citations!
BUT…
2005, ToN
Similar stories in electricity nets, citation nets, …
![Page 70: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/70.jpg)
A “typical” approach for identifying of heavy tails: Linear Regression
Heavy-tailedor light-tailed?
value
frequ
ency
“frequency plot”
![Page 71: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/71.jpg)
A “typical” approach for identifying of heavy tails: Linear Regression
Heavy-tailedor light-tailed?
log-linear scale
Log(
frequ
ency
)
value
![Page 72: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/72.jpg)
A “typical” approach for identifying of heavy tails: Linear Regression
Heavy-tailedor light-tailed?
= log = log −Linear ⇒Exponential tail
log-linear scale
![Page 73: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/73.jpg)
A “typical” approach for identifying of heavy tails: Linear Regression
Heavy-tailedor light-tailed? log-log scale
= log = log −Linear ⇒Exponential tail
log-linear scale
= log = log − ( + 1) logLinear ⇒Power-law tail
log-linear scale
![Page 74: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/74.jpg)
log-log scale
Regression ⇒Estimate of tail index ( )
= log = log − ( + 1) logLinear ⇒Power-law tail
![Page 75: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/75.jpg)
log-log scale
= log = log − ( + 1) logLinear ⇒Power-law tail
Regression ⇒Estimate of tail index ( )
Is it really linear?Is the estimate of accurate?
![Page 76: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/76.jpg)
log-log scale
Pr>log-log scale log-log scalelog-log scale
Pr > = =
“rank plot”
= log = log − ( + 1) log
“frequency plot”
= 1.4= 1.9True = 2.0
![Page 77: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/77.jpg)
This simple change is extremely important…
log-log scalelog-log scale log-log scalelog-log scale
“rank plot” “frequency plot”
Pr>
![Page 78: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/78.jpg)
log-log scale
This simple change is extremely important…
“frequency plot”
The data is from an Exponential!
Does this look like a power law?log-log scale
“rank plot”
Pr>
![Page 79: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/79.jpg)
log-log scale
“frequency plot”
Electricity grid degree distribution
= 3(from Science)
This mistake has happened A LOT!
“rank plot”
log-linear scale
Pr>
![Page 80: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/80.jpg)
This mistake has happened A LOT!
log-log scale
“frequency plot”
WWW degree distribution
= 1.1(from Science)
“rank plot”
log-log scale
= 1.7Pr>
![Page 81: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/81.jpg)
Linear ⇒Power-law tail
Regression ⇒Estimate of tail index ( )
But, this is still an error-prone approach
…other distributions can be nearly linear too
This simple change is extremely important…
log-log scale
Pr>log-log scale
“rank plot” …log-log scale log-log scale
WeibullLognormal
![Page 82: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/82.jpg)
log-log scale
Pr>log-log scale
Regression ⇒Estimate of tail index ( )
…assumptions of regression are not met
But, this is still an error-prone approachThis simple change is extremely important…
…tail is much noisier than the body
Linear ⇒Power-law tail…other distributions can be nearly linear too
“rank plot”
![Page 83: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/83.jpg)
A completely different approach: Maximum Likelihood Estimation (MLE)
What is the for which the data is most “likely”?; =log ( ; ) = log − log= ∑ log( / )Maximizing gives
This has many nice properties: is the minimal variance, unbiased estimator. is asymptotically efficient.
![Page 84: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/84.jpg)
A completely different approach: Maximum Likelihood Estimation (MLE)not so
= −∑ log( ̂ / )∑ log( / )asymptotically for large data sets, when weights are chosen as = 1/ (log − log ).
∼ ∑ log( / )=
Weighted Least Squares Regression (WLS)
![Page 85: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/85.jpg)
A completely different approach: Maximum Likelihood Estimation (MLE)not so
Weighted Least Squares Regression (WLS)
log-log scale
“Listen to your body”
asymptotically for large data sets, when weights are chosen as = 1/ (log − log ).
WLS, = 2.2LS, = 1.5
“rank plot”true = 2.0
![Page 86: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/86.jpg)
A quick summary of where we are:
Suppose data comes from a power-law (Pareto) distribution = .
Then, we can identify this visually with a log-log plot, and we can estimate using either MLE or WLS.
![Page 87: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/87.jpg)
log-log scale
“rank plot”
Suppose data comes from a power-law (Pareto) distribution = .
Then, we can identify this visually with a log-log plot, and we can estimate using either MLE or WLS.
What if the data is not exactly a power-law? What if only the tail is power-law?
But, where does the tail start?
Can we just use MLE/WLS on the “tail”?
Impossible to answer…
![Page 88: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/88.jpg)
An exampleSuppose we have a mixture of power laws:
…but, suppose we use as our cutoff: 1 → ( ) + (1 − )( ) ≠
Suppose we have a mixture of power laws:= + 1 − <We want → as → ∞.
![Page 89: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/89.jpg)
Identifying power-law distributions“Listen to your body”
Identifying power-law tails“Let the tail do the talking”
MLE/WLS
v.s.
Extreme value theory
![Page 90: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/90.jpg)
Returning to our exampleSuppose we have a mixture of power laws:= + 1 − <We want → as → ∞.…but, suppose we use as our cutoff: 1 → ( ) + (1 − )( )
The bias disappears as → ∞ !
![Page 91: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/91.jpg)
The idea: Improve robustness by throwing away nearly all the data!, where → ∞ as → ∞.
+ Larger ( ) ⇒Small bias- Larger ( ) ⇒ Larger variance
![Page 92: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/92.jpg)
, where → ∞ as → ∞.
The Hill Estimator, = ∑ logwhere ( )is the th largest data point
Looks almost like the MLE, butuses order th order statistic
The idea: Improve robustness by throwing away nearly all the data!
![Page 93: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/93.jpg)
The Hill Estimator
where ( )is the th largest data point …how do we choose ?
, → as → ∞ if / →0 & → ∞throw away nearly all the data,
but keep enough data for consistency
The idea: Improve robustness by throwing away nearly all the data!, where → ∞ as → ∞.
, = ∑ log Looks almost like the MLE, butuses order th order statistic
![Page 94: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/94.jpg)
The Hill Estimator
where ( )is the th largest data point …how do we choose ?
, → as → ∞ if / →0 & → ∞Throw away everything except the outliers!
The idea: Improve robustness by throwing away nearly all the data!, where → ∞ as → ∞.
, = ∑ log Looks almost like the MLE, butuses order th order statistic
![Page 95: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/95.jpg)
Choosing in practice: The Hill plot
![Page 96: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/96.jpg)
Choosing in practice: The Hill plot
2.4
2
1.6
Pareto, = 26
3
0
Exponential
![Page 97: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/97.jpg)
Choosing in practice: The Hill plot
2.4
2
1.6
Pareto, = 22.5
2
1.5
1
Mixture, with Pareto-tail, = 2
log-log scale
![Page 98: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/98.jpg)
…but the hill estimator has problems too
1.5
1
0.5 = 300= 1.4
This data is from TCP flow sizes!
= 12000= 0.92
“Hill horror plot”
![Page 99: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/99.jpg)
Identifying power-law distributions“Listen to your body”
Identifying power-law tails“Let the tail do the talking”
MLE/WLS
Hill estimator
It’s dangerous to rely on any one technique!(see our forthcoming book for other approaches)
![Page 100: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/100.jpg)
Heavy-tailed phenomena are treated as something
Mysterious, Surprising, & Controversial
1. Properties 2. Emergence
3. Identification
![Page 101: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/101.jpg)
Heavy-tailed phenomena are treated as something
Mysterious, Surprising, & Controversial
1. Properties 2. Emergence
3. Identification Pareto
Subexponential
Weibull
LogNormal
Regularly Varying
Heavy-tailed
Long-tailed distributions
![Page 102: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/102.jpg)
Heavy-tailed phenomena are treated as something
Mysterious, Surprising, & Controversial
1. Properties 2. Emergence
3. Identification
![Page 103: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/103.jpg)
Heavy-tailed phenomena are treated as something
Mysterious, Surprising, & Controversial
1. Properties 2. Emergence
3. Identification
![Page 104: 2013-SIGMETRICS-heavytails](https://reader030.fdocuments.in/reader030/viewer/2022032516/563dba63550346aa9aa52b0b/html5/thumbnails/104.jpg)
The Fundamentals of Heavy TailsProperties, Emergence, & Identification
Jayakrishnan Nair, Adam Wierman, Bert Zwart