Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? •...

188
Probability in Programming Prakash Panangaden School of Computer Science McGill University

Transcript of Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? •...

Page 1: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Probability in Programming

Prakash Panangaden School of Computer Science

McGill University

Page 2: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Two crucial issues

Page 3: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Two crucial issues

• Correctness of software:

Page 4: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Two crucial issues

• Correctness of software:

• with respect to precise specifications.

Page 5: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Two crucial issues

• Correctness of software:

• with respect to precise specifications.

• Efficiency of code:

Page 6: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Two crucial issues

• Correctness of software:

• with respect to precise specifications.

• Efficiency of code:

• based on well-designed algorithms.

Page 7: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Logic is the key

Page 8: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Logic is the key• Precise specifications have to be made in a formal

language

Page 9: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Logic is the key• Precise specifications have to be made in a formal

language

• with a rigourous definition of meaning.

Page 10: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Logic is the key• Precise specifications have to be made in a formal

language

• with a rigourous definition of meaning.

• Logic is the “calculus” of computer science.

Page 11: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Logic is the key• Precise specifications have to be made in a formal

language

• with a rigourous definition of meaning.

• Logic is the “calculus” of computer science.

• It comes with a framework for reasoning.

Page 12: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Logic is the key• Precise specifications have to be made in a formal

language

• with a rigourous definition of meaning.

• Logic is the “calculus” of computer science.

• It comes with a framework for reasoning.

• Many kinds of logic: propositional, predicate, modal, .....

Page 13: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Probability

Page 14: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Probability• Probability is also a framework for reasoning

Page 15: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Probability• Probability is also a framework for reasoning

• quantitatively.

Page 16: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Probability• Probability is also a framework for reasoning

• quantitatively.

• But is this relevant for computer programmers?

Page 17: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Probability• Probability is also a framework for reasoning

• quantitatively.

• But is this relevant for computer programmers?

• Yes!

Page 18: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Probability• Probability is also a framework for reasoning

• quantitatively.

• But is this relevant for computer programmers?

• Yes!

• Probabilistic reasoning is everywhere.

Page 19: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Some quotations

Page 20: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Some quotations

• The true logic of the world is the calculus of probabilities — James Clerk Maxwell

Page 21: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Some quotations

• The true logic of the world is the calculus of probabilities — James Clerk Maxwell

• The theory of probabilities is at bottom nothing but common sense reduced to calculus — Pierre Simon Laplace

Page 22: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Why does one use probability?

Page 23: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Why does one use probability?

• Some algorithms use probability as a computational resource: randomized algorithms.

Page 24: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Why does one use probability?

• Some algorithms use probability as a computational resource: randomized algorithms.

• Software for interacting with physical systems have to cope with noise and uncertainty: telecommunications, robotics, vision, control systems, ....

Page 25: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Why does one use probability?

• Some algorithms use probability as a computational resource: randomized algorithms.

• Software for interacting with physical systems have to cope with noise and uncertainty: telecommunications, robotics, vision, control systems, ....

• Big data and machine learning: probabilistic reasoning has had a revolutionary impact.

Page 26: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Basic Ideas

Page 27: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Basic Ideas

Sample space X: the set of things that can

possibly happen.

Page 28: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Basic Ideas

Sample space X: the set of things that can

possibly happen.

Event: subset of the sample space; A,B ⇢ X.

Page 29: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Basic Ideas

Sample space X: the set of things that can

possibly happen.

Event: subset of the sample space; A,B ⇢ X.

Probability: Pr : X ! [0, 1],

Px2X

Pr(x) = 1.

Page 30: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Basic Ideas

Sample space X: the set of things that can

possibly happen.

Event: subset of the sample space; A,B ⇢ X.

Probability: Pr : X ! [0, 1],

Px2X

Pr(x) = 1.

Probability of an event A: Pr(A) =

Px2A

Pr(x).

Page 31: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Basic Ideas

Sample space X: the set of things that can

possibly happen.

Event: subset of the sample space; A,B ⇢ X.

Probability: Pr : X ! [0, 1],

Px2X

Pr(x) = 1.

Probability of an event A: Pr(A) =

Px2A

Pr(x).

A,B are independent: Pr(A \B) = Pr(A) · Pr(B).

Page 32: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Basic Ideas

Sample space X: the set of things that can

possibly happen.

Event: subset of the sample space; A,B ⇢ X.

Probability: Pr : X ! [0, 1],

Px2X

Pr(x) = 1.

Probability of an event A: Pr(A) =

Px2A

Pr(x).

A,B are independent: Pr(A \B) = Pr(A) · Pr(B).

Subprobability:

Px2X

Pr(x) 1.

Page 33: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

A Puzzle

Page 34: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

A Puzzle

Imagine a town where every birth is equally likely

to give a boy or a girl. Pr(boy) = Pr(girl) = 12 .

Page 35: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

A Puzzle

Imagine a town where every birth is equally likely

to give a boy or a girl. Pr(boy) = Pr(girl) = 12 .

Each birth is an independent random event.

Page 36: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

A Puzzle

Imagine a town where every birth is equally likely

to give a boy or a girl. Pr(boy) = Pr(girl) = 12 .

Each birth is an independent random event.

There is a family with two children.

Page 37: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

A Puzzle

Imagine a town where every birth is equally likely

to give a boy or a girl. Pr(boy) = Pr(girl) = 12 .

Each birth is an independent random event.

There is a family with two children.

One of them is a boy (not specified which one), whatis the probability that the other one is a boy?

Page 38: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

A Puzzle

Imagine a town where every birth is equally likely

to give a boy or a girl. Pr(boy) = Pr(girl) = 12 .

Each birth is an independent random event.

There is a family with two children.

One of them is a boy (not specified which one), whatis the probability that the other one is a boy?

Since the births are independent, the probability

that the other child is a boy should be

12 . Right?

Page 39: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Puzzle (continued)

Page 40: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Puzzle (continued)

Wrong!

Page 41: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Puzzle (continued)

Wrong!

Initially, there are 4 equally likely situations:

bb, bg, gb, gg.

Page 42: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Puzzle (continued)

Wrong!

Initially, there are 4 equally likely situations:

bb, bg, gb, gg.

The possibility gg is ruled out with the

additional information.

Page 43: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Puzzle (continued)

Wrong!

Initially, there are 4 equally likely situations:

bb, bg, gb, gg.

The possibility gg is ruled out with the

additional information.

So of the three equally likely scenarios:

bb, bg, gb,

only one has the other child being a boy.

Page 44: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Puzzle (continued)

Wrong!

Initially, there are 4 equally likely situations:

bb, bg, gb, gg.

The possibility gg is ruled out with the

additional information.

So of the three equally likely scenarios:

bb, bg, gb,

only one has the other child being a boy.

The correct answer is

13 .

Page 45: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Puzzle (continued)

Wrong!

Initially, there are 4 equally likely situations:

bb, bg, gb, gg.

The possibility gg is ruled out with the

additional information.

So of the three equally likely scenarios:

bb, bg, gb,

only one has the other child being a boy.

The correct answer is

13 .

If I had said, “The elder child is a boy”, then the probability

that the other child is a boy is indeed

12

Page 46: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Conditional probability

Page 47: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Conditional probability

Conditioning = revising probability in the presence of

new information.

Page 48: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Conditional probability

Conditioning = revising probability in the presence of

new information.

Conditional probability/expectation is the heart of

probabilistic reasoning.

Page 49: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Conditional probability

Conditioning = revising probability in the presence of

new information.

Conditional probability/expectation is the heart of

probabilistic reasoning.

Conditional probability is tricky!

Page 50: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Conditional probability

Conditioning = revising probability in the presence of

new information.

Conditional probability/expectation is the heart of

probabilistic reasoning.

Conditional probability is tricky!

Analogous to inference in ordinary logic.

Page 51: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Conditional probability

Page 52: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Conditional probability

Definition: if A and B are events, the conditional probability

of A given B, written Pr(A | B) is defined by

Pr(A | B) = Pr(A \B)/Pr(B).

Page 53: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Conditional probability

Definition: if A and B are events, the conditional probability

of A given B, written Pr(A | B) is defined by

Pr(A | B) = Pr(A \B)/Pr(B).

We are told that the outcome is one of the possibilities

in B. We now need to change our guess for the outcome

being in A.

Page 54: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Conditional probability

Definition: if A and B are events, the conditional probability

of A given B, written Pr(A | B) is defined by

Pr(A | B) = Pr(A \B)/Pr(B).

We are told that the outcome is one of the possibilities

in B. We now need to change our guess for the outcome

being in A.

A

B

Page 55: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Bayes’ Rule

Pr(A | B) =Pr(B | A) · Pr(A)

Pr(B).

How to revise probabilities.

Page 56: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Bayes’ Rule

Pr(A | B) =Pr(B | A) · Pr(A)

Pr(B).

How to revise probabilities.

Proof is just from the definition.

Page 57: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Example

Page 58: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Example

Two coins, one fake (two heads) one OK.

Page 59: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Example

Two coins, one fake (two heads) one OK.

One coin chosen with equal probability and then tossedto yield a H.

Page 60: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Example

Two coins, one fake (two heads) one OK.

One coin chosen with equal probability and then tossedto yield a H.

What is the probability the coin was fake?

Page 61: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Example

Two coins, one fake (two heads) one OK.

One coin chosen with equal probability and then tossedto yield a H.

What is the probability the coin was fake?

Answer: 23 .

Page 62: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Example

Two coins, one fake (two heads) one OK.

One coin chosen with equal probability and then tossedto yield a H.

What is the probability the coin was fake?

Answer: 23 .

Pr(H | Fake) = 1,Pr(Fake) = 12 ,Pr(H) = 1

2 · 1 + 12 · 1

2 = 34 .

Page 63: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Example

Two coins, one fake (two heads) one OK.

One coin chosen with equal probability and then tossedto yield a H.

What is the probability the coin was fake?

Answer: 23 .

Pr(H | Fake) = 1,Pr(Fake) = 12 ,Pr(H) = 1

2 · 1 + 12 · 1

2 = 34 .

Hence Pr(Fake | H) = ( 12 )/(34 ) =

23 .

Page 64: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Example

Two coins, one fake (two heads) one OK.

One coin chosen with equal probability and then tossedto yield a H.

What is the probability the coin was fake?

Answer: 23 .

Pr(H | Fake) = 1,Pr(Fake) = 12 ,Pr(H) = 1

2 · 1 + 12 · 1

2 = 34 .

Hence Pr(Fake | H) = ( 12 )/(34 ) =

23 .

Similarly Pr(Fake | HHH) = 89 .

Page 65: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Example

Two coins, one fake (two heads) one OK.

One coin chosen with equal probability and then tossedto yield a H.

What is the probability the coin was fake?

Answer: 23 .

Pr(H | Fake) = 1,Pr(Fake) = 12 ,Pr(H) = 1

2 · 1 + 12 · 1

2 = 34 .

Hence Pr(Fake | H) = ( 12 )/(34 ) =

23 .

Similarly Pr(Fake | HHH) = 89 . Pr(Fake | H . . .H| {z }

n

) = 11+( 1

2 )n .

Page 66: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Bayes’ rule shows how to update the prior probability of Awith the new information that the outcome was B: this

gives the posterior probability of A given B.

Page 67: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Expectation values

Page 68: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Expectation values

A random variable r is a real-valued function on X.

Page 69: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Expectation values

A random variable r is a real-valued function on X.

The expectation value of r is E[r] =X

x2X

Pr(x)r(x).

Page 70: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Expectation values

A random variable r is a real-valued function on X.

The expectation value of r is E[r] =X

x2X

Pr(x)r(x).

The conditional expectation value of r given A is:

Page 71: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Expectation values

A random variable r is a real-valued function on X.

The expectation value of r is E[r] =X

x2X

Pr(x)r(x).

The conditional expectation value of r given A is:

E[r | A] =X

x2X

r(x)Pr({x} | A).

Page 72: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Expectation values

A random variable r is a real-valued function on X.

The expectation value of r is E[r] =X

x2X

Pr(x)r(x).

The conditional expectation value of r given A is:

E[r | A] =X

x2X

r(x)Pr({x} | A).

Conditional probability is a special case of

conditional expectation.

Page 73: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Calculating expectations through conditioning

Page 74: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Calculating expectations through conditioning

Two people roll dice independently.

Page 75: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Calculating expectations through conditioning

Two people roll dice independently.

Page 76: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Calculating expectations through conditioning

Two people roll dice independently.

The first one keeps rolling until

he gets a 1 immediately followed by a 2.

Page 77: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Calculating expectations through conditioning

Two people roll dice independently.

The second one keeps rolling

until she gets a 1 and a 1.

The first one keeps rolling until

he gets a 1 immediately followed by a 2.

Page 78: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Calculating expectations through conditioning

Two people roll dice independently.

The second one keeps rolling

until she gets a 1 and a 1.

The first one keeps rolling until

he gets a 1 immediately followed by a 2.

Do they have the same expected number of rolls?

Page 79: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Calculating expectations through conditioning

Two people roll dice independently.

The second one keeps rolling

until she gets a 1 and a 1.

The first one keeps rolling until

he gets a 1 immediately followed by a 2.

Do they have the same expected number of rolls?

If not, who is expected to finish first?

Page 80: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Calculating expectations through conditioning

Two people roll dice independently.

The second one keeps rolling

until she gets a 1 and a 1.

The first one keeps rolling until

he gets a 1 immediately followed by a 2.

Do they have the same expected number of rolls?

If not, who is expected to finish first?

What is the expected number of rolls for each one?

Page 81: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.
Page 82: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

For the first:

16 · (1 + 1

6 · 1 + . . .) + . . .

Page 83: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

For the first:

16 · (1 + 1

6 · 1 + . . .) + . . .

Is there a better way?

Page 84: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

For the first:

16 · (1 + 1

6 · 1 + . . .) + . . .

Is there a better way?

Use conditional expectations and think in terms of

state-transition diagrams:

Page 85: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

For the first:

16 · (1 + 1

6 · 1 + . . .) + . . .

Is there a better way?

Use conditional expectations and think in terms of

state-transition diagrams:

1

3, 4, 5, 6

3, 4, 5, 6

1 2

2,

Page 86: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

1

3, 4, 5, 6

3, 4, 5, 6

1 2

2,

Page 87: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Let x = E[Finish | Start]

1

3, 4, 5, 6

3, 4, 5, 6

1 2

2,

Page 88: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Let x = E[Finish | Start] Let y = E[Finish | 1]

1

3, 4, 5, 6

3, 4, 5, 6

1 2

2,

Page 89: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Let x = E[Finish | Start] Let y = E[Finish | 1]

1

3, 4, 5, 6

3, 4, 5, 6

1 2

2,

x = 56 · (1 + x) + 1

6 · (1 + y)

Page 90: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Let x = E[Finish | Start] Let y = E[Finish | 1]

1

3, 4, 5, 6

3, 4, 5, 6

1 2

2,

x = 56 · (1 + x) + 1

6 · (1 + y)

y = 16 · 1 + 1

6 · (1 + y) + 23 · (1 + x)

Page 91: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Let x = E[Finish | Start] Let y = E[Finish | 1]

1

3, 4, 5, 6

3, 4, 5, 6

1 2

2,

x = 56 · (1 + x) + 1

6 · (1 + y)

y = 16 · 1 + 1

6 · (1 + y) + 23 · (1 + x)

Easy to solve: x = 30, y = 36.

Page 92: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

1 1

2, 3, 4, 5, 6

2, 3, 4, 5, 6

Page 93: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

1 1

2, 3, 4, 5, 6

2, 3, 4, 5, 6

Let x = E[Finish | Start]

Page 94: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

1 1

2, 3, 4, 5, 6

2, 3, 4, 5, 6

Let x = E[Finish | Start] Let y = E[Finish | 1]

Page 95: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

1 1

2, 3, 4, 5, 6

2, 3, 4, 5, 6

Let x = E[Finish | Start] Let y = E[Finish | 1]

x = 16 · (1 + y) + 5

6 · (1 + x)

Page 96: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

1 1

2, 3, 4, 5, 6

2, 3, 4, 5, 6

Let x = E[Finish | Start] Let y = E[Finish | 1]

x = 16 · (1 + y) + 5

6 · (1 + x)

y = 16 · 1 + 5

6 · (1 + x)

Page 97: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

1 1

2, 3, 4, 5, 6

2, 3, 4, 5, 6

Let x = E[Finish | Start] Let y = E[Finish | 1]

x = 16 · (1 + y) + 5

6 · (1 + x)

y = 16 · 1 + 5

6 · (1 + x)

Easy to solve: x = 42, y = 36.

Page 98: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

1 1

2, 3, 4, 5, 6

2, 3, 4, 5, 6

Let x = E[Finish | Start] Let y = E[Finish | 1]

x = 16 · (1 + y) + 5

6 · (1 + x)

y = 16 · 1 + 5

6 · (1 + x)

Easy to solve: x = 42, y = 36.

Did you expect this to be the slower one?

Page 99: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Understanding programs

Page 100: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Understanding programs

The state of a program is the correspondence between

names and values.

Page 101: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Understanding programs

The state of a program is the correspondence between

names and values.

[X 7! 3, Y 7! 4, Z 7! �2.5]

Page 102: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Understanding programs

The state of a program is the correspondence between

names and values.

Running a part of a program changes the state.

[X 7! 3, Y 7! 4, Z 7! �2.5]

Page 103: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Understanding programs

The state of a program is the correspondence between

names and values.

Running a part of a program changes the state.

[X 7! 3, Y 7! 4, Z 7! �2.5]

if X > 1 then Y = Y + Z else Y = Z

Page 104: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Understanding programs

The state of a program is the correspondence between

names and values.

Running a part of a program changes the state.

[X 7! 3, Y 7! 4, Z 7! �2.5]

if X > 1 then Y = Y + Z else Y = Z

[X 7! 3, Y 7! 4, Z 7! �2.5] �! [X 7! 3, Y 7! 1.5, Z 7! �2.5]

Page 105: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Understanding programs

The state of a program is the correspondence between

names and values.

Running a part of a program changes the state.

[X 7! 3, Y 7! 4, Z 7! �2.5]

if X > 1 then Y = Y + Z else Y = Z

[X 7! 3, Y 7! 4, Z 7! �2.5] �! [X 7! 3, Y 7! 1.5, Z 7! �2.5]

Ordinary programs define state-transformer functions.

Page 106: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Understanding programs

The state of a program is the correspondence between

names and values.

Running a part of a program changes the state.

[X 7! 3, Y 7! 4, Z 7! �2.5]

if X > 1 then Y = Y + Z else Y = Z

[X 7! 3, Y 7! 4, Z 7! �2.5] �! [X 7! 3, Y 7! 1.5, Z 7! �2.5]

Ordinary programs define state-transformer functions.

When one combines program pieces one can

compose the functions to find the combined e↵ect.

Page 107: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Understanding programs

The state of a program is the correspondence between

names and values.

Running a part of a program changes the state.

[X 7! 3, Y 7! 4, Z 7! �2.5]

if X > 1 then Y = Y + Z else Y = Z

[X 7! 3, Y 7! 4, Z 7! �2.5] �! [X 7! 3, Y 7! 1.5, Z 7! �2.5]

Ordinary programs define state-transformer functions.

How do we understand probabilistic programs?

When one combines program pieces one can

compose the functions to find the combined e↵ect.

Page 108: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Understanding programs

The state of a program is the correspondence between

names and values.

Running a part of a program changes the state.

[X 7! 3, Y 7! 4, Z 7! �2.5]

if X > 1 then Y = Y + Z else Y = Z

[X 7! 3, Y 7! 4, Z 7! �2.5] �! [X 7! 3, Y 7! 1.5, Z 7! �2.5]

Ordinary programs define state-transformer functions.

How do we understand probabilistic programs?

As distribution transformers.

When one combines program pieces one can

compose the functions to find the combined e↵ect.

Page 109: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

X = 0; C = toss; if C = 1 then X = X + 1 else X = X � 1.

Page 110: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

X = 0; C = toss; if C = 1 then X = X + 1 else X = X � 1.

Initial distribution: [X 7! (0, 1.0), C 7! (0, 1.0)]

Page 111: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

X = 0; C = toss; if C = 1 then X = X + 1 else X = X � 1.

Initial distribution: [X 7! (0, 1.0), C 7! (0, 1.0)]

Final distribution: [X 7! (1, 0.5)(�1, 0.5), C 7! (0, 0.5)(1, 0.5)]

Page 112: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

X = 0; C = toss; if C = 1 then X = X + 1 else X = X � 1.

Initial distribution: [X 7! (0, 1.0), C 7! (0, 1.0)]

Final distribution: [X 7! (1, 0.5)(�1, 0.5), C 7! (0, 0.5)(1, 0.5)]

A Markov chain has S: states and a probability

distribution transformer T .

Page 113: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

X = 0; C = toss; if C = 1 then X = X + 1 else X = X � 1.

Initial distribution: [X 7! (0, 1.0), C 7! (0, 1.0)]

Final distribution: [X 7! (1, 0.5)(�1, 0.5), C 7! (0, 0.5)(1, 0.5)]

A Markov chain has S: states and a probability

distribution transformer T .

T : St⇥ St ! [0, 1] or T : St ! Dist(St).

Page 114: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

X = 0; C = toss; if C = 1 then X = X + 1 else X = X � 1.

Initial distribution: [X 7! (0, 1.0), C 7! (0, 1.0)]

Final distribution: [X 7! (1, 0.5)(�1, 0.5), C 7! (0, 0.5)(1, 0.5)]

A Markov chain has S: states and a probability

distribution transformer T .

T : St⇥ St ! [0, 1] or T : St ! Dist(St).

T (s1, s2) is the conditional probability of being

in state s2 after the transition given that the

state was s1 before.

Page 115: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

X = 0; C = toss; if C = 1 then X = X + 1 else X = X � 1.

Initial distribution: [X 7! (0, 1.0), C 7! (0, 1.0)]

Final distribution: [X 7! (1, 0.5)(�1, 0.5), C 7! (0, 0.5)(1, 0.5)]

A Markov chain has S: states and a probability

distribution transformer T .

T : St⇥ St ! [0, 1] or T : St ! Dist(St).

T (s1, s2) is the conditional probability of being

in state s2 after the transition given that the

state was s1 before.

Markov property: the transition probability only depends

on the current state, not on the whole history.

Page 116: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Because of the Markov property, one can describe

the e↵ect of a transition by a matrix.

Page 117: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Because of the Markov property, one can describe

the e↵ect of a transition by a matrix.

When one combines probabilistic program pieces

one can multiply the transition matrices to find

the combined e↵ect.

Page 118: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Because of the Markov property, one can describe

the e↵ect of a transition by a matrix.

When one combines probabilistic program pieces

one can multiply the transition matrices to find

the combined e↵ect.

We are understanding the program by stepping forwards.

Page 119: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Because of the Markov property, one can describe

the e↵ect of a transition by a matrix.

When one combines probabilistic program pieces

one can multiply the transition matrices to find

the combined e↵ect.

We are understanding the program by stepping forwards.

This is called “forwards” or state-transformer semantics.

Page 120: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Backwards semantics

Page 121: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Maybe we do not want to track every detail of the state

as it changes.

Backwards semantics

Page 122: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Maybe we do not want to track every detail of the state

as it changes.

Backwards semantics

Perhaps we want to know if a property holds,

e.g. X > 0.

Page 123: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Maybe we do not want to track every detail of the state

as it changes.

Backwards semantics

Perhaps we want to know if a property holds,

e.g. X > 0.

We write {P} step {Q} to mean that P holds

before the step and Q holds after the step.

Page 124: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Maybe we do not want to track every detail of the state

as it changes.

Backwards semantics

Perhaps we want to know if a property holds,

e.g. X > 0.

We write {P} step {Q} to mean that P holds

before the step and Q holds after the step.

{X > 0} X = X � 5 {X > 0} ??

Page 125: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Maybe we do not want to track every detail of the state

as it changes.

Backwards semantics

Perhaps we want to know if a property holds,

e.g. X > 0.

We write {P} step {Q} to mean that P holds

before the step and Q holds after the step.

{X > 0} X = X � 5 {X > 0} ??

We cannot say anything for sure after the step!

Page 126: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

But we can go backwards!

Page 127: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

But we can go backwards!

{X > 5} X = X � 5 {X > 0}

Page 128: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

But we can go backwards!

{X > 5} X = X � 5 {X > 0} �

Page 129: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

But we can go backwards!

{X > 5} X = X � 5 {X > 0}

We must read this di↵erently: if we want X > 0

after the step, we must make sure X > 5 before the step.

Page 130: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

But we can go backwards!

{X > 5} X = X � 5 {X > 0}

We must read this di↵erently: if we want X > 0

after the step, we must make sure X > 5 before the step.

This is called “predicate-transformer” semantics.

Page 131: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

But we can go backwards!

{X > 5} X = X � 5 {X > 0}

We must read this di↵erently: if we want X > 0

after the step, we must make sure X > 5 before the step.

This is called “predicate-transformer” semantics.

Can we do something like this for probabilistic programs?

Page 132: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Classical logic Generalization

Truth values {0, 1} Probabilities [0, 1]Predicate Random variable

State Distribution

The satisfaction relation |= Integration

R

Page 133: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Instead of predicates we have real-valued functions.

Page 134: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Instead of predicates we have real-valued functions.

Suppose that your probabilistic program describes a

search for some resource.

Page 135: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Instead of predicates we have real-valued functions.

Suppose that your probabilistic program describes a

search for some resource.

Suppose that r : St ! R is the “expected reward”

in each state.

Page 136: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Instead of predicates we have real-valued functions.

Suppose that your probabilistic program describes a

search for some resource.

Suppose that r : St ! R is the “expected reward”

in each state.

We write

ˆTr for a new reward function defined by

(

ˆTr)(s) =X

s02S

T (s, s0)r(s0).

Page 137: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Instead of predicates we have real-valued functions.

Suppose that your probabilistic program describes a

search for some resource.

Suppose that r : St ! R is the “expected reward”

in each state.

We write

ˆTr for a new reward function defined by

(

ˆTr)(s) =X

s02S

T (s, s0)r(s0).

This tells you the expected reward before the transition

assuming that r is the reward after the transition.

Page 138: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Probabilistic Models

Page 139: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Probabilistic Models

• A general framework to reason about situations computationally and quantitatively.

Page 140: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Probabilistic Models

• A general framework to reason about situations computationally and quantitatively.

• Most important class of models: graphical models.

Page 141: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Probabilistic Models

• A general framework to reason about situations computationally and quantitatively.

• Most important class of models: graphical models.

• They capture dependence and independence and

Page 142: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Probabilistic Models

• A general framework to reason about situations computationally and quantitatively.

• Most important class of models: graphical models.

• They capture dependence and independence and

• conditional independence.

Page 143: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Complex systems

Page 144: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Complex systems• Complex systems have many variables.

Page 145: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Complex systems• Complex systems have many variables.

• They are related in intricate ways: correlated or independent.

Page 146: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Complex systems• Complex systems have many variables.

• They are related in intricate ways: correlated or independent.

• They may be conditionally independent or dependent.

Page 147: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Complex systems• Complex systems have many variables.

• They are related in intricate ways: correlated or independent.

• They may be conditionally independent or dependent.

• There may be causal connections.

Page 148: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Complex systems• Complex systems have many variables.

• They are related in intricate ways: correlated or independent.

• They may be conditionally independent or dependent.

• There may be causal connections.

• We want to represent several random variables that may be connected in different ways.

Page 149: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

A simple graphical model

Page 150: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

A simple graphical model

Trying to analyze what is wrong with a web site:

Page 151: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

A simple graphical model

Trying to analyze what is wrong with a web site:

could be buggy (yes/no)

Page 152: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

A simple graphical model

Trying to analyze what is wrong with a web site:

could be buggy (yes/no)

could be under attack (yes/no)

Page 153: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

A simple graphical model

Trying to analyze what is wrong with a web site:

could be buggy (yes/no)

could be under attack (yes/no)

tra�c (very high/high/moderate/low)

Page 154: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

A simple graphical model

Trying to analyze what is wrong with a web site:

could be buggy (yes/no)

could be under attack (yes/no)

tra�c (very high/high/moderate/low)

Symptoms: crashed or slow

Page 155: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

A simple graphical model

Trying to analyze what is wrong with a web site:

could be buggy (yes/no)

could be under attack (yes/no)

tra�c (very high/high/moderate/low)

Symptoms: crashed or slow

There are 64 states.

Page 156: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.
Page 157: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

The arrows show probabilistic dependencies.

Page 158: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

The arrows show probabilistic dependencies.

Everything in the top row is independent of each other.

Page 159: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

The arrows show probabilistic dependencies.

Everything in the top row is independent of each other.

We write A?B for A is independent of B.

Page 160: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

The arrows show probabilistic dependencies.

Everything in the top row is independent of each other.

We write A?B for A is independent of B.

Perhaps too simple:

if attacked then the tra�c should be heavy.

Page 161: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

This version: tra�c is a↵ected by being attacked.

Page 162: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Medical example (from Koller and Friedman)

Page 163: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Medical example (from Koller and Friedman)

Flu and allergy are correlated through season.

Page 164: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Medical example (from Koller and Friedman)

Flu and allergy are correlated through season.

Given the season, they are independent: (A?F | S)

Page 165: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.
Page 166: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Independence relations:

Page 167: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Independence relations:

(F?A | S), (Sn?S | F,A), (P?A,Sn | F ), (P | Sn | F )

Page 168: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Independence relations:

(F?A | S), (Sn?S | F,A), (P?A,Sn | F ), (P | Sn | F )

P (Sn | F,A, S) = P (Sn | F,A)

Page 169: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Independence relations:

(F?A | S), (Sn?S | F,A), (P?A,Sn | F ), (P | Sn | F )

P (Sn | F,A, S) = P (Sn | F,A)

Sn depends on S but it is conditionally independent given A and F .

Page 170: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Independence and Factorization

Page 171: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Independence and Factorization

Why do we care about conditional independence?

Page 172: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Independence and Factorization

Why do we care about conditional independence?

Because we can factorize the joint distributions.

Page 173: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Independence and Factorization

Why do we care about conditional independence?

Because we can factorize the joint distributions.

Page 174: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Independence and Factorization

Why do we care about conditional independence?

Because we can factorize the joint distributions.

P (S,A, F, P, Sn) =

P (S)P (F |S)P (A|S)P (Sn|A,F )P (Pain|F )

Page 175: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Independence and Factorization

Why do we care about conditional independence?

Because we can factorize the joint distributions.

P (S,A, F, P, Sn) =

P (S)P (F |S)P (A|S)P (Sn|A,F )P (Pain|F )

A huge advantage for representing, computing and reasoning.

Page 176: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Where can we go from here?

Page 177: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Where can we go from here?

Continuous state spaces: robotics, telecommunication,

control systems, sensor systems.

Page 178: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Where can we go from here?

Continuous state spaces: robotics, telecommunication,

control systems, sensor systems.

Requires measure theory and analysis.

Page 179: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Where can we go from here?

Continuous state spaces: robotics, telecommunication,

control systems, sensor systems.

Requires measure theory and analysis. Why?

Page 180: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Where can we go from here?

Continuous state spaces: robotics, telecommunication,

control systems, sensor systems.

Requires measure theory and analysis. Why?

It is subtle to answer questions like:

Page 181: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Where can we go from here?

Continuous state spaces: robotics, telecommunication,

control systems, sensor systems.

Requires measure theory and analysis. Why?

It is subtle to answer questions like:

I choose a real number between 0 and 1 uniformly, what is the probability of

getting a rational number?

Page 182: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Where can we go from here?

Continuous state spaces: robotics, telecommunication,

control systems, sensor systems.

Requires measure theory and analysis. Why?

It is subtle to answer questions like:

I choose a real number between 0 and 1 uniformly, what is the probability of

getting a rational number? Answer: 0!

Page 183: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Where can we go from here?

Continuous state spaces: robotics, telecommunication,

control systems, sensor systems.

Requires measure theory and analysis. Why?

It is subtle to answer questions like:

I choose a real number between 0 and 1 uniformly, what is the probability of

getting a rational number? Answer: 0!

If every single point has probability 0 how can I have anything interesting?

Page 184: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Where can we go from here?

Continuous state spaces: robotics, telecommunication,

control systems, sensor systems.

Requires measure theory and analysis. Why?

It is subtle to answer questions like:

I choose a real number between 0 and 1 uniformly, what is the probability of

getting a rational number? Answer: 0!

If every single point has probability 0 how can I have anything interesting?

Is it possible to assign a probability to every set?

Page 185: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Where can we go from here?

Continuous state spaces: robotics, telecommunication,

control systems, sensor systems.

Requires measure theory and analysis. Why?

It is subtle to answer questions like:

I choose a real number between 0 and 1 uniformly, what is the probability of

getting a rational number? Answer: 0!

If every single point has probability 0 how can I have anything interesting?

Is it possible to assign a probability to every set?

Answer: No!

Page 186: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Where can we go from here?

Continuous state spaces: robotics, telecommunication,

control systems, sensor systems.

Requires measure theory and analysis. Why?

It is subtle to answer questions like:

I choose a real number between 0 and 1 uniformly, what is the probability of

getting a rational number? Answer: 0!

If every single point has probability 0 how can I have anything interesting?

Is it possible to assign a probability to every set?

Answer: No!

There is a rich and fascinating theory of programming

and reasoning about probabilistic systems.

Page 187: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Logic and Probability are your weapons.

Go forth and conquer the software world!

Page 188: Probability in Programmingprakash/Talks/tsd.pdf · 2015-05-14 · Why does one use probability? • Some algorithms use probability as a computational resource: randomized algorithms.

Thank you!