ECE-517: Reinforcement Learning in Artificial...


Lecture 3: Review of basic probability theory

Dr. Itamar Arel

College of Engineering

Electrical Engineering and Computer Science Department

The University of Tennessee

Fall 2015

August 27, 2015


Outline

Probability theory fundamentals

Random variables


Basic definitions

The collection or set of all possible distinct outcomes of an experiment is called the sample space of the experiment/trial.

Flipping a coin: {H, T}

Rolling a die: {1, 2, 3, 4, 5, 6}

Outcomes: the elements of the sample space

Event: a set of possible outcomes (a subset of the sample space) of an experiment/trial

[Figure: an experiment and its sample space]


More definitions

Independence: two experiments are independent if the outcome of either one does not depend on the outcome of the other

Deterministic: the outcome of a trial is predictable (100%)

Randomness: the absence of any pattern

A sample space is called discrete if it is a finite or a countably infinite set; otherwise it is called continuous

Probability can be viewed as the likelihood of an event occurring


Fundamentals

Let S denote the sample space and let {A_i} be the set of all possible outcomes, with probabilities P(A_i), respectively

$P(A_i) \ge 0$ for all i

$\sum_i P(A_i) = 1$

For example, a probabilistic model might represent the length of a packet sent over a network

Two events A and B are called mutually exclusive, or disjoint, if they have no common outcomes

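In general, $P(A + B) = P(A) + P(B) - P(AB)$, where A + B denotes the union and AB the intersection of A and B

If A and B are mutually exclusive, then $P(AB) = 0$ and $P(A + B) = P(A) + P(B)$

[Figure: Venn diagram of events A and B with overlap AB]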
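The addition rule for two events can be checked by direct enumeration. The sketch below assumes a single roll of a fair die with equally likely outcomes; the events A, B and C are hypothetical choices made only for illustration.

```python
from fractions import Fraction

# Sample space of one fair die roll; each outcome is assumed equally likely.
SAMPLE_SPACE = frozenset({1, 2, 3, 4, 5, 6})


def prob(event):
    """Probability of an event, i.e. a subset of the sample space."""
    return Fraction(len(event & SAMPLE_SPACE), len(SAMPLE_SPACE))


A = frozenset({2, 4, 6})  # hypothetical event: the roll is even
B = frozenset({4, 5, 6})  # hypothetical event: the roll is at least 4
C = frozenset({1, 3})     # hypothetical event that is disjoint from A

# General case: P(A + B) = P(A) + P(B) - P(AB)
union_ab = prob(A | B)
inclusion_exclusion = prob(A) + prob(B) - prob(A & B)

# Mutually exclusive case: P(AC) = 0, so P(A + C) = P(A) + P(C)
union_ac = prob(A | C)
sum_ac = prob(A) + prob(C)

print(union_ab, inclusion_exclusion, union_ac, sum_ac)
```

Exact fractions are used so that the two sides of each identity can be compared without rounding error.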




Discrete Random Variables

A random variable (r.v.) is a function that assigns a real number to each outcome in the sample space of a random experiment

For a discrete r.v. X, the probability mass function (PMF) gives the probability that X will take on a particular value in its range.

We denote this by $P_X$, i.e.

$P_X(x) = P(X = x)$

The expected value of a discrete r.v. X is defined by

$E[X] = \sum_x x \, P_X(x)$

The variance of X is defined as

$E[(X - E[X])^2] = E[X^2] - (E[X])^2$

Question: in what scenario will the variance be zero?
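The expected value and variance of a discrete r.v. can be computed directly from its PMF. The following is a minimal sketch, assuming the PMF is stored as a dictionary that maps each value x to P(X = x); the helper names `expectation` and `variance` and the two example PMFs are illustrative only.

```python
from fractions import Fraction


def expectation(pmf):
    """E[X] = sum over x of x * P(X = x)."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """Var(X) = E[X^2] - (E[X])^2."""
    mean = expectation(pmf)
    second_moment = sum(x * x * p for x, p in pmf.items())
    return second_moment - mean * mean


# Hypothetical example 1: a fair six-sided die.
die_pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# Hypothetical example 2: an r.v. that always takes the value 3.
constant_pmf = {3: Fraction(1)}

print(expectation(die_pmf), variance(die_pmf))
print(expectation(constant_pmf), variance(constant_pmf))
```

The second example places all probability mass on a single value, which is one case worth examining when thinking about the question above.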


Bernoulli and Geometric Random Variables with parameter p

X is a Bernoulli r.v. with parameter p if it can take on values 1 (success) and 0 (failure) with

$P(X = 1) = p$

$P(X = 0) = 1 - p$

Example: Packet arrivals may be modeled as either correct (1) or erroneous (0)

Given a sequence of independent Bernoulli r.v.'s, let T be the number of trials observed up to and including the first success. Then T has a geometric distribution; its PMF is given by

$P(T = n) = (1-p)^{n-1} p$, for n = 1, 2, …

E[T]=1/p
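The geometric PMF and its mean can be checked numerically. This sketch assumes p = 0.3 and truncates the infinite sums at a hypothetical cutoff `N_TERMS`, beyond which the remaining probability mass is negligible.

```python
def geometric_pmf(n, p):
    """P(T = n): n - 1 failures followed by one success."""
    return (1 - p) ** (n - 1) * p


p = 0.3          # assumed success probability
N_TERMS = 200    # assumed truncation point for the infinite sums

# The PMF should sum to (approximately) 1 over n = 1, 2, ...
total_mass = sum(geometric_pmf(n, p) for n in range(1, N_TERMS + 1))

# The mean should be (approximately) 1 / p.
mean = sum(n * geometric_pmf(n, p) for n in range(1, N_TERMS + 1))

print(total_mass, mean, 1 / p)
```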


Memoryless property – the fact that there were n time steps separating success events has no influence on future events

The memoryless property makes the geometric distribution very useful in various analysis tasks

[Figure: geometric distribution with N = 16, p = 0.3]


Binomial Random Variable with parameters p and n

Let S denote the number of successes out of n independent Bernoulli r.v.'s. The PMF is given by

$P(S = k) = \binom{n}{k} p^k (1-p)^{n-k}$, for k = 0, 1, …, n.

The expected number of successes is given by

$E[S] = np$

Example: if packets arrive correctly at a node in a network with probability p (independently), then the number of correct arrivals out of n is a Binomial r.v.
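The binomial PMF can be evaluated with the standard library. The sketch below assumes n = 10 packets and a per-packet success probability p = 0.9 (both values are illustrative), and checks that the PMF sums to 1 and has mean np.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(S = k) = C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


n, p = 10, 0.9  # assumed number of packets and success probability

pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]
total_mass = sum(pmf)
mean = sum(k * q for k, q in enumerate(pmf))

print(total_mass, mean, n * p)
```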


Mean of a Binomial r.v.

Note that:
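One standard argument for $E[S] = np$, given here as a sketch and not necessarily the derivation used on the original slide, writes S as a sum of n Bernoulli r.v.'s and applies linearity of expectation. The names $X_1, \dots, X_n$ are introduced for illustration.

```latex
\begin{align*}
S &= \sum_{i=1}^{n} X_i, \qquad E[X_i] = 1 \cdot p + 0 \cdot (1 - p) = p, \\
E[S] &= E\!\left[\sum_{i=1}^{n} X_i\right] = \sum_{i=1}^{n} E[X_i] = np.
\end{align*}
```

Linearity of expectation does not require independence, so this step holds even before the independence of the trials is used.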


Examples (from ECE-453)

Consider the following network. Packets transmitted from Router A to Router B have a packet error rate (PER) of p_AB, while packets transmitted from Router B to Router C have a PER of p_BC. The packet error rates are assumed to be independent.

[Figure: Router A --(p_AB)--> Router B --(p_BC)--> Router C]

• If all traffic from Router A to Router C traverses Router B, what is the probability that all N packets transmitted from Router A to Router C are received correctly?

• Given that N packets were transmitted from Router A to Router B, write an expression for the probability that at least m of those N packets are received correctly.

• Assuming Router A has transmitted N packets to Router C (via Router B), what is the probability that exactly m packets (where m<N) are received correctly at Router C?
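One possible way to set up these three questions is sketched below. It assumes that a packet is received correctly at Router C only if it survives both hops, that packets are independent of one another, and that the number of correct packets is therefore binomial. The function and parameter names are hypothetical, and this is one reading of the problem, not an official solution.

```python
from math import comb


def prob_all_correct(n_packets, p_ab, p_bc):
    """All N packets survive both hops A->B and B->C."""
    q = (1 - p_ab) * (1 - p_bc)  # per-packet end-to-end success probability
    return q ** n_packets


def prob_at_least_m_correct(n_packets, m, p_ab):
    """At least m of N packets are received correctly on the A->B hop."""
    s = 1 - p_ab  # per-packet success probability on the single hop
    return sum(
        comb(n_packets, k) * s ** k * p_ab ** (n_packets - k)
        for k in range(m, n_packets + 1)
    )


def prob_exactly_m_correct(n_packets, m, p_ab, p_bc):
    """Exactly m of N packets are received correctly at Router C."""
    q = (1 - p_ab) * (1 - p_bc)
    return comb(n_packets, m) * q ** m * (1 - q) ** (n_packets - m)


# Illustrative values only: N = 2 packets, both PERs equal to 0.5.
print(prob_all_correct(2, 0.5, 0.5))
print(prob_at_least_m_correct(2, 1, 0.5))
print(prob_exactly_m_correct(2, 1, 0.5, 0.5))
```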


CDF and PDF

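The Cumulative Distribution Function (cdf) of a r.v. X, $F_X(x)$, is defined as the probability of the event {X ≤ x}:

$F_X(x) = P[X \le x], \quad -\infty < x < \infty$

Related axioms are:

$0 \le F_X(x) \le 1, \quad \lim_{x \to \infty} F_X(x) = 1, \quad \lim_{x \to -\infty} F_X(x) = 0$

if $a \le b$ then $F_X(a) \le F_X(b)$

The probability density function (pdf) of a r.v. X, $f_X(x)$, is defined as the derivative of the cdf:

$f_X(x) = \frac{dF_X(x)}{dx}$

For a discrete r.v., the corresponding PMF is $P_X(k) = \Pr\{X = k\}$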
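For a discrete r.v. the cdf is the running sum of the PMF, and successive differences of the cdf recover the PMF (the discrete counterpart of differentiating the cdf). The sketch below assumes a fair six-sided die as the example distribution.

```python
from fractions import Fraction
from itertools import accumulate

# Assumed example: a fair six-sided die.
values = [1, 2, 3, 4, 5, 6]
pmf = [Fraction(1, 6)] * len(values)

# cdf[i] = P(X <= values[i]) is the cumulative sum of the PMF.
cdf = list(accumulate(pmf))

# A cdf never decreases and reaches 1 at the largest value.
is_nondecreasing = all(a <= b for a, b in zip(cdf, cdf[1:]))

# Differences of consecutive cdf values give back the PMF.
recovered_pmf = [cdf[0]] + [b - a for a, b in zip(cdf, cdf[1:])]

print(cdf)
print(is_nondecreasing, recovered_pmf == pmf)
```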


Exponential Distribution

Continuous random variable

Continuous-time analog of the geometric distribution (the memoryless property holds)

Models lifetime, inter-arrival times,…
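The standard exponential pdf and cdf can be written down and cross-checked numerically. This sketch assumes the usual rate parameterization, f(x) = lam * exp(-lam * x) for x ≥ 0, where the rate name `lam` and its value are illustrative choices not taken from the slides.

```python
import math


def exp_pdf(x, lam):
    """pdf of an exponential r.v. with rate lam (zero for x < 0)."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0


def exp_cdf(x, lam):
    """cdf of an exponential r.v. with rate lam (zero for x < 0)."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0


lam = 2.0   # assumed rate
x0 = 1.0    # point at which the pdf/cdf relation is checked
h = 1e-5    # step for the central-difference derivative

# The pdf should match the numerical derivative of the cdf.
numeric_derivative = (exp_cdf(x0 + h, lam) - exp_cdf(x0 - h, lam)) / (2 * h)

print(numeric_derivative, exp_pdf(x0, lam))
```

Under this parameterization the mean is 1/lam, which mirrors E[T] = 1/p for the geometric distribution.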


Minimum of Independent Exponential rvs

Assume X_1, X_2, …, X_n are independent exponential r.v.'s
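A well-known result for this setting is that the minimum is itself exponential, with rate equal to the sum of the individual rates. The derivation below is a sketch that assumes $X_i$ has rate $\lambda_i$; the rate symbols are introduced here for illustration.

```latex
\begin{align*}
P\!\left(\min_i X_i > t\right)
  &= P(X_1 > t, \dots, X_n > t)
   = \prod_{i=1}^{n} P(X_i > t) \\
  &= \prod_{i=1}^{n} e^{-\lambda_i t}
   = e^{-\left(\sum_{i=1}^{n} \lambda_i\right) t}, \qquad t \ge 0.
\end{align*}
```

The product step uses independence, and the final expression is the tail probability of an exponential r.v. with rate $\sum_i \lambda_i$.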


Memoryless Property

True for Geometric and Exponential Dist.:

The coin does not remember that it came up tails l times

Root cause of Markov property (discussed later)
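The memoryless property states that P(X > m + n | X > m) = P(X > n). It can be checked numerically from the tail probabilities of both distributions. The sketch assumes p = 0.3 for the geometric case and a rate of 2.0 for the exponential case; all function names are illustrative.

```python
import math


def geometric_tail(n, p):
    """P(T > n): the first n trials are all failures."""
    return (1 - p) ** n


def geometric_conditional_tail(m, n, p):
    """P(T > m + n | T > m)."""
    return geometric_tail(m + n, p) / geometric_tail(m, p)


def exponential_tail(t, lam):
    """P(X > t) for an exponential r.v. with rate lam."""
    return math.exp(-lam * t)


def exponential_conditional_tail(s, t, lam):
    """P(X > s + t | X > s)."""
    return exponential_tail(s + t, lam) / exponential_tail(s, lam)


p, lam = 0.3, 2.0  # assumed parameters

# Having already waited m steps (or s time units) does not change the odds.
print(geometric_conditional_tail(5, 3, p), geometric_tail(3, p))
print(exponential_conditional_tail(1.5, 0.4, lam), exponential_tail(0.4, lam))
```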


Useful Results

The following are some results that are useful for manipulating many of the equations that may arise when dealing with discrete-time probabilistic models

$\sum_{k=0}^{n} x^k = \frac{1 - x^{n+1}}{1 - x}$

when |x|<1,

$\sum_{k=0}^{\infty} x^k = \frac{1}{1 - x}$

Differentiating both sides of the previous equation (and multiplying by x) yields another useful expression:

$\sum_{k=0}^{\infty} k x^k = \frac{x}{(1 - x)^2}$
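The geometric-series identities can be sanity-checked numerically. The sketch below assumes x = 0.7 and approximates the infinite sums with a hypothetical cutoff `N_TERMS`; each identity being tested is restated in the comments.

```python
x = 0.7         # assumed value with |x| < 1
n = 10          # assumed upper limit for the finite sum
N_TERMS = 500   # assumed truncation point for the infinite sums

# Finite sum: sum_{k=0}^{n} x^k = (1 - x^(n+1)) / (1 - x)
finite_sum = sum(x ** k for k in range(n + 1))
finite_closed_form = (1 - x ** (n + 1)) / (1 - x)

# Infinite sum: sum_{k=0}^{inf} x^k = 1 / (1 - x)
geometric_sum = sum(x ** k for k in range(N_TERMS))
geometric_closed_form = 1 / (1 - x)

# Weighted sum: sum_{k=0}^{inf} k x^k = x / (1 - x)^2
weighted_sum = sum(k * x ** k for k in range(N_TERMS))
weighted_closed_form = x / (1 - x) ** 2

print(finite_sum, finite_closed_form)
print(geometric_sum, geometric_closed_form)
print(weighted_sum, weighted_closed_form)
```

As an application, setting x = 1 - p in the last identity gives the mean of the geometric distribution: the sum of n (1-p)^(n-1) p over n ≥ 1 equals p / (1 - (1 - p))^2 = 1/p.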
