Modeling correlations and dependencies among intervals Scott Ferson and Vladik Kreinovich REC’06 Savannah, Georgia, 23 February 2006

Page 1

Modeling correlations and dependencies among intervals

Scott Ferson and Vladik Kreinovich

REC’06 Savannah, Georgia, 23 February 2006

Page 2

Interval analysis

Advantages
  Natural for scientists and easy to explain
  Works wherever the uncertainty comes from
  Works without specifying intervariable dependencies

Disadvantages
  Ranges can grow quickly and become very wide
  Cannot use information about dependence

Badmouthing interval analysis?

Page 3

Probability v. intervals

Probability theory
  Can handle dependence well
  Has an inadequate model of ignorance

LYING: saying more than you really know

Interval analysis
  Can handle epistemic uncertainty (ignorance) well
  Has an inadequate model of dependence

COWARDICE: saying less than you know

I said this in Copenhagen, and nobody objected

Page 4

My perspective

Elementary methods of interval analysis
  Low-dimensional, usually static problems
  Huge uncertainties
  Verified computing
  Important to be best possible

Naïve methods very easy to use

Intervals combined with probability theory
  Need to be able to live with probabilists

Page 5

Dependence in probability theory

Copulas fully capture arbitrary dependence between random variables (functional, shuffles, all)

2-increasing functions onto [0,1], with four edges fixed

[Figure: three copulas plotted over the unit square, with axes u and v]

Perfect: $M(u,v) = \min(u,v)$
Independent: $\Pi(u,v) = uv$
Opposite: $W(u,v) = \max(u+v-1,\ 0)$
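The three copulas named on this slide can be checked numerically. The sketch below is only an illustration: the function names and the grid resolution are choices made here, not part of the talk. It verifies on a grid that W ≤ Π ≤ M (the Fréchet bounds) and that each function has the four fixed edges mentioned above.

```python
def M(u, v):
    """Perfect (comonotonic) dependence: min(u, v)."""
    return min(u, v)

def Pi(u, v):
    """Independence: the product u * v."""
    return u * v

def W(u, v):
    """Opposite (countermonotonic) dependence: max(u + v - 1, 0)."""
    return max(u + v - 1.0, 0.0)

def check_frechet_bounds(n=50, eps=1e-12):
    """True if W <= Pi <= M at every point of an (n+1) x (n+1) grid."""
    for i in range(n + 1):
        for j in range(n + 1):
            u, v = i / n, j / n
            if W(u, v) > Pi(u, v) + eps or Pi(u, v) > M(u, v) + eps:
                return False
    return True

def edges_fixed(C, n=50, eps=1e-12):
    """True if C(t,0) = C(0,t) = 0 and C(t,1) = C(1,t) = t on a grid of t."""
    for i in range(n + 1):
        t = i / n
        if abs(C(t, 0.0)) > eps or abs(C(0.0, t)) > eps:
            return False
        if abs(C(t, 1.0) - t) > eps or abs(C(1.0, t) - t) > eps:
            return False
    return True
```

A grid check is of course not a proof; it only makes the ordering of the three cases concrete.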

Page 6

Dependence in the bivariate case

Any restriction on the possible pairings between inputs (any subset of the unit square)
May also require each value of u to match with at least one v, and vice versa

A little simpler than a copula

The null restriction is the full unit square
Call this “nondependence” rather than independence

D denotes the set of all possible dependencies (set of all subsets of the unit square)

Page 7

Two sides of a single coin

Mechanistic dependence
  Neumaier: “correlation”

Computational dependence
  Neumaier: “dependent”
  Francisco Cháves: decorrelation

Same representations used for both
Maybe the same origin phenomenologically
I’m mostly talking about mechanistic

Page 8

Three special cases

[Figure: three panels on the unit square, u on the horizontal axis and v on the vertical axis: Perfect (comonotonic), Nondependent (the Fréchet case), Opposite (countermonotonic)]

Page 9

Correlation

A model of dependence that’s parameterized by a (scalar) value called the “correlation coefficient”

$\rho : [-1, +1] \to D$

The correlation model is called “complete” if

$\rho(-1)$ = opposite, $\rho(0)$ = nondependent, $\rho(+1)$ = perfect

Page 10

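Corner-shaving dependence

[Figure: three panels showing the dependence region for r = −1, r = 0, r = +1]

$$D(r) = \{ (u,v) : \max(0,\ -u-r,\ u-1+r) \le v \le \min(1,\ u+1-r,\ -u+2+r) \}, \quad u \in [0,1],\ v \in [0,1]$$

$$f(A, B) = \{ c : c = f(u (a_2 - a_1) + a_1,\ v (b_2 - b_1) + b_1),\ (u,v) \in D \}$$

$$A+B = [\,\mathrm{env}(w(A, -r)+b_1,\ a_1+w(B,-r)),\ \mathrm{env}(a_2+w(B,1+r),\ w(A,1+r)+b_2)\,]$$

$$w([a_1,a_2], p) = \begin{cases} a_1 & \text{if } p < 0 \\ a_2 & \text{if } 1 < p \\ p(a_2-a_1)+a_1 & \text{otherwise} \end{cases}$$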
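The corner-shaving sum can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' implementation: it reads env as "take the smaller candidate for the lower endpoint and the larger candidate for the upper endpoint", and it restores the minus signs in D(r) and in w(A, −r) that the three special cases require (r = +1 gives v = u, r = −1 gives v = 1 − u, r = 0 gives the full square). The function names and the grid-based cross-check are hypothetical helpers.

```python
def w(interval, p):
    """Point a fraction p of the way through the interval, clamped to its ends."""
    a1, a2 = interval
    if p < 0:
        return a1
    if p > 1:
        return a2
    return p * (a2 - a1) + a1

def corner_shaving_sum(A, B, r):
    """Closed-form range of A + B under corner-shaving dependence r."""
    a1, a2 = A
    b1, b2 = B
    lo = min(w(A, -r) + b1, a1 + w(B, -r))
    hi = max(a2 + w(B, 1 + r), w(A, 1 + r) + b2)
    return lo, hi

def in_D(u, v, r, tol=1e-9):
    """Membership test for the corner-shaving region D(r), with a small tolerance."""
    lower = max(0.0, -u - r, u - 1 + r)
    upper = min(1.0, u + 1 - r, -u + 2 + r)
    return lower - tol <= v <= upper + tol

def brute_force_sum(A, B, r, n=100):
    """Range of A + B over grid points (u, v) of the unit square that lie in D(r)."""
    a1, a2 = A
    b1, b2 = B
    values = [
        (a1 + (i / n) * (a2 - a1)) + (b1 + (j / n) * (b2 - b1))
        for i in range(n + 1)
        for j in range(n + 1)
        if in_D(i / n, j / n, r)
    ]
    return min(values), max(values)
```

With A = [2,5] and B = [3,9], `corner_shaving_sum(A, B, -0.7)` gives approximately [7.1, 11.9], and the grid search agrees. A positive r shaves the off-diagonal corners, which do not affect the range of a sum, so r = +0.7 returns [5, 14].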

Page 11

Other complete correlation families

[Figure: panels for r = −1, r = 0, r = +1]

Page 12

Elliptic dependence

Page 13

Elliptic dependence

Not complete (because r = 0 isn’t nondependence)

[Figure: panels for r = −1, r = 0, r = +1]

Page 14

Parabolic dependence

Page 15

Parabolic dependence

[Figure: panels for r = −1, r = 0, r = +1]

A variable and its square or square root have this dependence

Variables that are not related by squaring could also have this dependence relation

e.g., A = [1,5], B = [1,10]

Page 16

So what difference does it make?

Page 17

A + B, with A = [2,5] and B = [3,9]

[ 5, 14]        Perfect
[ 8, 11]        Opposite
[ 7.1, 11.9]    Corner-shaving (r = −0.7)
[ 7.27, 11.73]  Elliptic (r = −0.7)
[ 5, 14]        Upper, left
[ 5, 11]        Lower, left
[ 8, 14]        Upper, right
[ 5, 14]        Lower, right
[ 6.5, 12.5]    Diamond
[ 5, 14]        Nondependent
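The perfect, opposite and nondependent rows can be reproduced by a direct search over the allowed (u, v) pairings. The sketch below is a hypothetical helper, not the authors' code; it indexes the unit square by integer grid positions so that the diagonal and anti-diagonal tests are exact.

```python
def sum_range(A, B, allowed, n=100):
    """Range of A + B over grid pairings (i/n, j/n) that satisfy allowed(i, j, n)."""
    a1, a2 = A
    b1, b2 = B
    values = [
        (a1 + (i / n) * (a2 - a1)) + (b1 + (j / n) * (b2 - b1))
        for i in range(n + 1)
        for j in range(n + 1)
        if allowed(i, j, n)
    ]
    return min(values), max(values)

def perfect(i, j, n):
    """Comonotonic: v = u."""
    return i == j

def opposite(i, j, n):
    """Countermonotonic: v = 1 - u."""
    return i + j == n

def nondependent(i, j, n):
    """No restriction: the full unit square."""
    return True

A, B = (2, 5), (3, 9)
ranges = {
    "perfect": sum_range(A, B, perfect),
    "opposite": sum_range(A, B, opposite),
    "nondependent": sum_range(A, B, nondependent),
}
```

For addition, perfect dependence and nondependence give the same range, [5, 14], because the extreme sums occur at the corners (0,0) and (1,1), which both cases allow. Only opposite dependence tightens the answer, to [8, 11].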

Page 18

Eliciting dependence

As hard as getting intervals (maybe a bit worse)

Theoretical or “physics-based” arguments

Inference from empirical data
  Risk of loss of rigor at this step (just as there is when we try to infer intervals from data)

Page 19

Generalization to multiple dimensions

Pairwise
  Matrix of two-dimensional dependence relations
  Relatively easy to elicit

Multivariate
  Subset of the unit hypercube
  Potentially much better tightening
  Computationally harder, but the problem is already NP-hard, so this doesn’t spoil the party

Page 20

Computing

Sequence of binary operations
  Need to deduce dependencies of intermediate results with each other and the original inputs
  Different calculation order may give different results

Do all at once in one multivariate calculation
  Can be much more difficult computationally
  Can produce much better tightening

Page 21

Living (in sin) with probabilists

Page 22

Probability box (p-box)

Interval bounds on a cumulative distribution function

[Figure: a p-box, with X from 0.0 to 3.0 on the horizontal axis and cumulative probability from 0 to 1 on the vertical axis]

Page 23

Generalizes intervals and probability

[Figure: three panels plotting cumulative probability from 0 to 1: Probability distribution, Probability box, Interval; with the annotation “Not a uniform distribution”]

Page 24

Probability bounds arithmetic

What’s the sum A+B?

[Figure: p-boxes for A and B, each plotted as cumulative probability from 0 to 1]

Page 25

Cartesian product

A+B, independence / nondependent

                       A ∈ [1,3]            A ∈ [2,4]            A ∈ [3,5]
                       p1 = 1/3             p2 = 1/3             p3 = 1/3
B ∈ [2,8], q1 = 1/3    [3,11], prob = 1/9   [4,12], prob = 1/9   [5,13], prob = 1/9
B ∈ [6,10], q2 = 1/3   [7,13], prob = 1/9   [8,14], prob = 1/9   [9,15], prob = 1/9
B ∈ [8,12], q3 = 1/3   [9,15], prob = 1/9   [10,16], prob = 1/9  [11,17], prob = 1/9
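The Cartesian product on this slide is mechanical enough to sketch in code. The representation below, a list of (interval, mass) pairs, and the helper names are assumptions made for illustration. Each cell adds the intervals endpoint by endpoint (nondependence inside the cell) and multiplies the masses (independence between cells). The second helper reads bounds on the cumulative probability P(A+B ≤ x) off the cells.

```python
from fractions import Fraction
from itertools import product

third = Fraction(1, 3)
A = [((1, 3), third), ((2, 4), third), ((3, 5), third)]
B = [((2, 8), third), ((6, 10), third), ((8, 12), third)]

def cartesian_sum(A, B):
    """All pairwise interval sums, each weighted by the product of the masses."""
    return [
        ((a1 + b1, a2 + b2), p * q)
        for ((a1, a2), p), ((b1, b2), q) in product(A, B)
    ]

def cdf_bounds(cells, x):
    """Lower and upper bounds on P(A+B <= x) implied by the cells."""
    lower = sum(m for (lo, hi), m in cells if hi <= x)  # cells entirely at or below x
    upper = sum(m for (lo, hi), m in cells if lo <= x)  # cells that may be at or below x
    return lower, upper

cells = cartesian_sum(A, B)
```

For example, `cdf_bounds(cells, 12)` returns (2/9, 1): only the cells [3,11] and [4,12] lie entirely at or below 12, while every cell starts at or below 12.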

Page 26

A+B, independent/nondependent

[Figure: p-box for A+B, cumulative probability from 0.00 to 1.00 against A+B from 0 to 18]

Page 27

A+B, opposite / nondependent

                       A ∈ [1,3]            A ∈ [2,4]            A ∈ [3,5]
                       p1 = 1/3             p2 = 1/3             p3 = 1/3
B ∈ [2,8], q1 = 1/3    [3,11], prob = 0     [4,12], prob = 0     [5,13], prob = 1/3
B ∈ [6,10], q2 = 1/3   [7,13], prob = 0     [8,14], prob = 1/3   [9,15], prob = 0
B ∈ [8,12], q3 = 1/3   [9,15], prob = 1/3   [10,16], prob = 0    [11,17], prob = 0

Page 28

A+B, opposite / nondependent

[Figure: p-box for A+B, cumulative probability from 0 to 1 against A+B from 0 to 18]

Page 29

A+B, opposite / opposite

                       A ∈ [1,3]            A ∈ [2,4]            A ∈ [3,5]
                       p1 = 1/3             p2 = 1/3             p3 = 1/3
B ∈ [2,8], q1 = 1/3    [5,9], prob = 0      [6,10], prob = 0     [7,11], prob = 1/3
B ∈ [6,10], q2 = 1/3   [9,11], prob = 0     [10,12], prob = 1/3  [11,13], prob = 0
B ∈ [8,12], q3 = 1/3   [11,13], prob = 1/3  [12,14], prob = 0    [13,15], prob = 0
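The two opposite-dependence tables differ only in how each cell is computed, which a short sketch makes explicit. The code assumes, as on these slides, that the focal elements are listed in increasing order and carry equal masses, so that opposite dependence between cells simply pairs the lowest element of A with the highest element of B. The function names are hypothetical.

```python
from fractions import Fraction

third = Fraction(1, 3)
A = [(1, 3), (2, 4), (3, 5)]    # focal elements of A, each with mass 1/3
B = [(2, 8), (6, 10), (8, 12)]  # focal elements of B, each with mass 1/3

def opposite_pairs(A, B):
    """Pair the i-th lowest element of A with the i-th highest element of B."""
    return list(zip(A, reversed(B)))

def add_nondependent(a, b):
    """Interval sum with no assumption inside the cell."""
    return (a[0] + b[0], a[1] + b[1])

def add_opposite(a, b):
    """Interval sum when low values of a go with high values of b inside the cell."""
    lo, hi = sorted((a[0] + b[1], a[1] + b[0]))
    return (lo, hi)

pairs = opposite_pairs(A, B)
opposite_nondependent = [(add_nondependent(a, b), third) for a, b in pairs]
opposite_opposite = [(add_opposite(a, b), third) for a, b in pairs]
```

The first list reproduces the three nonzero cells [9,15], [8,14] and [5,13]; the second gives the narrower cells [11,13], [10,12] and [7,11].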

Page 30

A+B, opposite / opposite

[Figure: p-box for A+B, cumulative probability from 0.00 to 1.00 against A+B from 0 to 18]

Page 31

Three answers say different things

[Figure: p-boxes for A+B under the three dependence assumptions, cumulative probability from 0.00 to 1.00 against A+B from 0 to 18]

Page 32

Conclusions

Interval analysis automatically accounts for all possible dependencies
  Unlike probability theory, where the default assumption often underestimates uncertainty

Information about dependencies isn’t usually used to tighten results, but it can be

Variable repetition is just a special kind of dependence

Page 33

End

Page 34

                    Failure             Success
Wishful thinking    Negligence          Dumb luck
Prudent analysis    Honorable failure   Good engineering

Page 35

Independence

In the context of precise probabilities, there was a unique notion of independence

In the context of imprecise probabilities, however, this notion disintegrates into several distinct concepts

The different kinds of independence behave differently in computations

Page 36

Equivalent definitions of independence

Several definitions of independence
  $H(x,y) = F(x)\,G(y)$, for all values x and y
  $P(X \in I,\ Y \in J) = P(X \in I)\,P(Y \in J)$, for any $I, J \subseteq \mathbb{R}$
  $h(x,y) = f(x)\,g(y)$, for all values x and y
  $E(w(X)\,z(Y)) = E(w(X))\,E(z(Y))$, for arbitrary w, z
  $\varphi_{X,Y}(t,s) = \varphi_X(t)\,\varphi_Y(s)$, for arbitrary t and s

$P(X \le x) = F(x)$, $P(Y \le y) = G(y)$ and $P(X \le x,\ Y \le y) = H(x, y)$; f, g and h are the density analogs of F, G and H; and $\varphi$ denotes the Fourier transform

For precise probabilities, all these definitions are equivalent, so there’s a single concept

Page 37

Imprecise probability independence
  Random-set independence
  Epistemic irrelevance (asymmetric)
  Epistemic independence
  Strong independence
  Repetition independence
  Others?

Which should be called ‘independence’?

Page 38

Notation

X and Y are random variables
$F_X$ and $F_Y$ are their probability distributions

$F_X$ and $F_Y$ aren’t known precisely, but we know they’re within classes $M_X$ and $M_Y$

$X \sim F_X \in M_X$

$Y \sim F_Y \in M_Y$

Page 39

Repetition independence

X and Y are random variables
X and Y are independent (in the traditional sense)
X and Y are identically distributed according to F
F is unknown, but we know that $F \in M$

X and Y are repetition independent

Analog of iid (independent and identically distributed)

$M_{X,Y} = \{H : H(x, y) = F(x)\,F(y),\ F \in M\}$

Page 40

Strong independence

$X \sim F_X \in M_X$ and $Y \sim F_Y \in M_Y$

X and Y are stochastically independent
All possible combinations of distributions from $M_X$ and $M_Y$ are allowed

X and Y are strongly independent

Complete absence of any relationship between X, Y

$M_{X,Y} = \{H : H(x, y) = F_X(x)\,F_Y(y),\ F_X \in M_X,\ F_Y \in M_Y\}$
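A toy case shows how the joint sets for repetition independence and strong independence differ. Everything in the sketch is an assumption made for illustration: the class M holds just two Bernoulli distributions, both variables use the same class, and a joint distribution is stored as a tuple of four cell probabilities.

```python
from fractions import Fraction
from itertools import product

def bernoulli(p):
    """Distribution on {0, 1}, stored as (P(0), P(1))."""
    return (1 - p, p)

def joint(F, G):
    """Product joint distribution H(x, y) = F(x) G(y) over (0,0), (0,1), (1,0), (1,1)."""
    return tuple(F[x] * G[y] for x in (0, 1) for y in (0, 1))

# Hypothetical class of candidate marginals, used for both X and Y
M = [bernoulli(Fraction(1, 5)), bernoulli(Fraction(1, 2))]

# Repetition independence: the same unknown F is used for both variables
repetition = {joint(F, F) for F in M}

# Strong independence: every combination of marginals from the classes is allowed
strong = {joint(F, G) for F, G in product(M, M)}
```

Here `repetition` has two members and `strong` has four, with the first a proper subset of the second, which is the innermost step of the nesting among the independence concepts.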

Page 41

Epistemic independence

$X \sim F_X \in M_X$ and $Y \sim F_Y \in M_Y$

$\underline{E}(f(X) \mid Y) = \underline{E}(f(X))$ and $\underline{E}(f(Y) \mid X) = \underline{E}(f(Y))$ for all functions f

where $\underline{E}$ is the smallest mean over all possible probability distributions

X and Y are epistemically independent

Lower bounds on expectations generalize the conditions P(X|Y) = P(X) and P(Y|X) = P(Y)

Page 42

Random-set independence

Embodied in Cartesian products

X and Y with mass functions $m_X$ and $m_Y$ are random-set independent if the Dempster-Shafer structure for their joint distribution has mass function $m(A_1 \times A_2) = m_X(A_1)\,m_Y(A_2)$ whenever $A_1$ is a focal element of X and $A_2$ is a focal element of Y, and $m(A) = 0$ otherwise

Often easiest to compute

Page 43

These cases of independence are nested.

[Figure: nested regions labelled Repetition, Strong, Epistemic and Random-set, shown in three panels, with the outer labels (Uncorrelated) and (Nondependent)]

Page 44

Interesting example

X = [−1, +1], Y = {([−1, 0], ½), ([0, 1], ½)}

If X and Y are “independent”, what is Z = XY?

[Figure: p-boxes for X and Y, each on the range −1 to +1 with cumulative probability from 0 to 1]

Page 45

Compute via Yager’s convolution

                     Y: ([−1, 0], ½)    Y: ([0, 1], ½)
X: ([−1, +1], 1)     ([−1, +1], ½)      ([−1, +1], ½)

The Cartesian product with one row and two columns produces this p-box

[Figure: p-box for XY on the range −1 to +1, cumulative probability from 0 to 1]
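The one-row, two-column product can be reproduced with ordinary interval multiplication. The sketch below is illustrative; the (interval, mass) representation and the helper names are assumptions, not notation from the talk.

```python
from fractions import Fraction
from itertools import product

def interval_mul(a, b):
    """Product of two intervals: the hull of the four endpoint products."""
    candidates = [x * y for x in a for y in b]
    return (min(candidates), max(candidates))

def convolve(X, Y, op):
    """Cartesian-product convolution of two lists of (interval, mass) pairs."""
    return [(op(a, b), p * q) for (a, p), (b, q) in product(X, Y)]

half = Fraction(1, 2)
X = [((-1, 1), Fraction(1))]
Y = [((-1, 0), half), ((0, 1), half)]

Z = convolve(X, Y, interval_mul)
```

Both cells come out as [−1, +1] with mass ½, so the resulting p-box says nothing beyond the range of Z.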

Page 46

But consider the means

Clearly, EX = [−1, +1] and EY = [−½, +½].
Therefore, E(XY) = [−½, +½].
But if this is the mean of the product, and its range is [−1, +1], then we know better bounds on the CDF.

[Figure: tightened p-box for XY on the range −1 to +1, cumulative probability from 0 to 1]

Page 47

And consider the quantity signs

What’s the probability PZ that Z < 0?

Z < 0 only if X < 0 or Y < 0 (but not both)
PZ = PX(1−PY) + PY(1−PX), where PX = P(X < 0), PY = P(Y < 0)

But PY is ½ by construction

So PZ = ½PX + ½(1−PX) = ½

Thus, zero is the median of Z
Knowing median and range improves bounds

[Figure: tightened p-box for XY on the range −1 to +1, cumulative probability from 0 to 1]
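The sign argument is easy to confirm with exact arithmetic. The sketch below evaluates PZ = PX(1−PY) + PY(1−PX) over a range of values for PX with PY fixed at ½. The function name is hypothetical, and the formula itself presumes that the events X < 0 and Y < 0 are stochastically independent, as in the slide.

```python
from fractions import Fraction

def prob_product_negative(px, py):
    """P(XY < 0) when exactly one of two independent factors must be negative."""
    return px * (1 - py) + py * (1 - px)

half = Fraction(1, 2)

# Try PX = 0, 1/10, ..., 1 with PY fixed at 1/2
results = [prob_product_negative(Fraction(k, 10), half) for k in range(11)]
```

Every entry of `results` equals ½, whatever PX is, which is why zero is the median of Z.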

Page 48

Best possible

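These bounds are realized by solutions
  If X = 0, then Z = 0
  If X = Y = B = {(−1, ½), (+1, ½)}, then Z = B

So these bounds are also best possible

[Figure: three panels on the range −1 to +1 with cumulative probability from 0 to 1: the distribution B, the point mass Z = 0, and the p-box for XY]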
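Both realizing solutions can be checked by enumerating a discrete product. The sketch assumes the two factors are stochastically independent and represents a precise distribution as a dictionary from value to probability; the helper name is hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

def product_distribution(P, Q):
    """Distribution of the product of two independent discrete random variables."""
    out = defaultdict(Fraction)  # missing keys start at Fraction(0)
    for x, p in P.items():
        for y, q in Q.items():
            out[x * y] += p * q
    return dict(out)

half = Fraction(1, 2)
B = {-1: half, 1: half}     # mass 1/2 at -1 and at +1
zero = {0: Fraction(1)}     # X identically zero

Z_from_B = product_distribution(B, B)
Z_from_zero = product_distribution(zero, B)
```

`Z_from_B` is again B, and `Z_from_zero` is the point mass at 0, matching the two cases listed on the slide.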

Page 49

So which is correct?

[Figure: three p-boxes for XY on the range −1 to +1, labelled Random-set independence, Strong independence and Moment independence]

The answer depends on what one meant by “independent”.

Page 50

So what?

The example illustrates a practical difference between random-set independence and strong independence

It disproves the conjecture that the convolution of uncertain numbers is not affected by dependence assumptions if at least one of them is an interval

It tempers the claim about the best-possible nature of convolutions with probability boxes and Dempster-Shafer structures

Page 51

Strategy for risk analysts

Random-set independence is conservative

Using the Cartesian product approach is always rigorous, though it may not be optimal

Convenient methods to obtain tighter bounds under other kinds of independence await discovery

Page 52

Uncertainty algebra for convolutions

Operands and operation                        Answers under different dependence assumptions
1) One interval                               random-set = unknown
2) Two intervals                              strong = epistemic = random-set = unknown
3) Interval and a function of an interval     strong = epistemic = random-set = unknown
4) One interval and monotone operation        strong = epistemic = random-set = unknown
5) Monotone operation                         strong = epistemic = random-set
6) Two precise distributions                  strong = epistemic = random-set
7) All cases                                  repetition ⊆ strong ⊆ epistemic ⊆ random-set ⊆ unknown

(Colored words denote the set of distributions that result from binary operations invoking the specified assumption about dependence)

(after Fetz and Oberguggenberger 2004)