Lecture 2 Elements of Probability...
Transcript of Lecture 2 Elements of Probability...
Lecture 2Elements of Probability Theory
Americo Barbosa da Cunha Junior
Universidade do Estado do Rio de Janeiro – UERJ
NUMERICO – Nucleus of Modeling and Experimentation with Computers
http://numerico.ime.uerj.br
www.americocunha.org
Uncertainties 2018April 8, 2018
Florianopolis - SC, Brazil
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 1 / 47
Probability in Dimension 1
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 2 / 47
Random experiment
An experiment which repeated under same fixed conditions producedifferent results is called random experiment.
Examples:
1 Rolling a cube-shaped fare die
2 Choosing an integer even number randomly
3 Measuring temperature
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 3 / 47
Probability space
The mathematical framework in which a random experiment is de-scribed consists of a triplet (Ω,Σ,P), called probability space.
The elements of a probability space are:
Ω: sample space(set with all possible events)
Σ: σ-algebra on Ω(set with relevant events only)
P: probability measure(measure of expectation of an event occurrence)
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 4 / 47
Sample space
A non-empty set which contains all possible events for a certainrandom experiment is called sample space, being represented by Ω.
Examples:
1 Rolling a cube-shaped fare die (finite Ω)
Ω = 1, 2, 3, 4, 5, 62 Choosing an integer even number randomly (denumerable Ω)
Ω = · · · ,−8,−6,−4,−2, 0, 2, 4, 6, 8, · · · 3 Measuring temperature in Kelvin (non-denumerable Ω)
Ω = [a, b] ⊂ [0,+∞)
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 5 / 47
σ-algebra of events
In general, not all of the events in Ω are of interest.
Intuitively, a σ-algebra on Ω is the set of relevant outcomes for arandom experiment. Formally, Σ is a σ-algebra on Ω if
φ ∈ Σ(contains the empty set)
Ac ∈ Σ for any A ∈ Σ(closed under complementation)∞⋃i=1
Ai ∈ Σ for any Ai ∈ Σ
(closed under denumerable unions)
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 6 / 47
Probability measure
A probability measure is a function P : Σ→ [0, 1] ⊂ R such that
P A ≥ 0 for any A ∈ Σ(probability is nonnegative)
P Ω = 1(entire space has probability one)
P
∞⋃i=1
Ai
=∞∑i=1
P Ai for any Ai mutually disjoint
(σ-additivity)
Remark:P φ = 0 (empty set has probability zero)
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 7 / 47
An example in continuous probability
A point is randomly choosen in a square (side b).
Event A: point lie above main diagonal
Event B: point lie in main diagonal
Event C: point lie outside main diagonal
P A =area of upper triangle
area of square=
0.5 b2
b2=
1
2
P B =area of main diagonal
area of square=
0
b2= 0
P C =area outside main diagonal
area of square=
b2 − 0
b2= 1
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 8 / 47
An example in continuous probability
A point is randomly choosen in a square (side b).
Event A: point lie above main diagonal
Event B: point lie in main diagonal
Event C: point lie outside main diagonal
P A =area of upper triangle
area of square=
0.5 b2
b2=
1
2
P B =area of main diagonal
area of square=
0
b2= 0
P C =area outside main diagonal
area of square=
b2 − 0
b2= 1
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 8 / 47
An example in continuous probability
A point is randomly choosen in a square (side b).
Event A: point lie above main diagonal
Event B: point lie in main diagonal
Event C: point lie outside main diagonal
P A =area of upper triangle
area of square=
0.5 b2
b2=
1
2
P B =area of main diagonal
area of square=
0
b2= 0
P C =area outside main diagonal
area of square=
b2 − 0
b2= 1
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 8 / 47
An example in continuous probability
A point is randomly choosen in a square (side b).
Event A: point lie above main diagonal
Event B: point lie in main diagonal
Event C: point lie outside main diagonal
P A =area of upper triangle
area of square=
0.5 b2
b2=
1
2
P B =area of main diagonal
area of square=
0
b2= 0
P C =area outside main diagonal
area of square=
b2 − 0
b2= 1
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 8 / 47
Remarks on probability
Probability zero
An impossible event has probability zero(e.g. roll a six faces dice, numbered from 1 to 6, and get 7)
Not every event with probability zero is impossible(e.g. randomly pick a point on the main diagonal of a square)
Probability one
An event which occurrence is certain has probability one(e.g. throw a coin and obtain head or tail)
Not every event with probability one occurs(e.g. randomly pick a point outside square’s main diagonal)
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 9 / 47
Conditional probability
Consider a pair of random events A and B such that P B > 0.
The conditional probability of A, given the occurrence of B, denoted
as PA|B
, is defined as
PA | B
=P A ∩ BP B
.
It follows that
P A ∩ B = PA | B
× P B .
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 10 / 47
An example in conditional probability
Somebody rolls a pair of six-sided dice.
A = value rolled on die 1
B = value rolled on die 2
What is the probability that A = 2 given that A + B ≤ 5?
A=2
P A = 6/36
A+B≤5
P A+B≤5 = 10/36
A |A+B≤5
P A |A+B≤5 = 3/10
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 11 / 47
Conditional probability — Wikipedia, The Free Encyclopedia, 2017.
An example in conditional probability
Somebody rolls a pair of six-sided dice.
A = value rolled on die 1
B = value rolled on die 2
What is the probability that A = 2 given that A + B ≤ 5?A=2
P A = 6/36
A+B≤5
P A+B≤5 = 10/36
A |A+B≤5
P A |A+B≤5 = 3/10
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 11 / 47
Conditional probability — Wikipedia, The Free Encyclopedia, 2017.
An example in conditional probability
Somebody rolls a pair of six-sided dice.
A = value rolled on die 1
B = value rolled on die 2
What is the probability that A = 2 given that A + B ≤ 5?A=2
P A = 6/36
A+B≤5
P A+B≤5 = 10/36
A |A+B≤5
P A |A+B≤5 = 3/10
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 11 / 47
Conditional probability — Wikipedia, The Free Encyclopedia, 2017.
An example in conditional probability
Somebody rolls a pair of six-sided dice.
A = value rolled on die 1
B = value rolled on die 2
What is the probability that A = 2 given that A + B ≤ 5?A=2
P A = 6/36
A+B≤5
P A+B≤5 = 10/36
A |A+B≤5
P A |A+B≤5 = 3/10
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 11 / 47
Conditional probability — Wikipedia, The Free Encyclopedia, 2017.
An example in conditional probability
Somebody rolls a pair of six-sided dice.
A = value rolled on die 1
B = value rolled on die 2
What is the probability that A = 2 given that A + B ≤ 5?A=2
P A = 6/36
A+B≤5
P A+B≤5 = 10/36
A |A+B≤5
P A |A+B≤5 = 3/10
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 11 / 47
Conditional probability — Wikipedia, The Free Encyclopedia, 2017.
An example in conditional probability
Somebody rolls a pair of six-sided dice.
A = value rolled on die 1
B = value rolled on die 2
What is the probability that A = 2 given that A + B ≤ 5?A=2
P A = 6/36
A+B≤5
P A+B≤5 = 10/36
A |A+B≤5
P A |A+B≤5 = 3/10
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 11 / 47
Conditional probability — Wikipedia, The Free Encyclopedia, 2017.
An example in conditional probability
Somebody rolls a pair of six-sided dice.
A = value rolled on die 1
B = value rolled on die 2
What is the probability that A = 2 given that A + B ≤ 5?A=2
P A = 6/36
A+B≤5
P A+B≤5 = 10/36
A |A+B≤5
P A |A+B≤5 = 3/10
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 11 / 47
Conditional probability — Wikipedia, The Free Encyclopedia, 2017.
Independence of events
If the occurrence of an event B does not affect the occurrence of anevent A one has
PA | B
= P A .
In this way, once P A ∩ B = PA | B
× P B it is true that
P A ∩ B = P A × P B .
Events A and B in which the latter holds are said to be independent.
Remark:This notion generalizes itself naturally to n events.
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 12 / 47
An example in events independence
A card is drawn from a deck with 52 unknown cards.
Event 1: Q “queen”Event 2: ♠“spade”Are these events independent?
Probability space 2 (unfair deck):
cards =
2♦,3♦,4♦,5♦,6♦,7♦,8♦,9♦,10♦,J♦,Q♦,K♦,A♦,
2♥,3♥,4♥,5♥,6♥,7♥,8♥,9♥,10♥,J♥,Q♥,K♥,A♥,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,
P2 Q = 28/52P2 ♠ = 1/2P2 Q♠ = 1/2 6= 28/52︸ ︷︷ ︸
P2Q
× 1/2︸︷︷︸P2♠
Events are not independent.
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 13 / 47
An example in events independence
A card is drawn from a deck with 52 unknown cards.
Event 1: Q “queen”Event 2: ♠“spade”Are these events independent?
Probability space 1 (fair deck):
cards =
2♦,3♦,4♦,5♦,6♦,7♦,8♦,9♦,10♦,J♦,Q♦,K♦,A♦,
2♣,3♣,4♣,5♣,6♣,7♣,8♣,9♣,10♣,J♣,Q♣,K♣,A♣,2♥,3♥,4♥,5♥,6♥,7♥,8♥,9♥,10♥,J♥,Q♥,K♥,A♥,2♠,3♠,4♠,5♠,6♠,7♠,8♠,9♠,10♠,J♠,Q♠,K♠,A♠
P1 Q = 4/52P1 ♠ = 13/52P1 Q♠ = 1/52 = 4/52︸︷︷︸
P1Q
× 13/52︸ ︷︷ ︸P1♠
Events are independent.
Probability space 2 (unfair deck):
cards =
2♦,3♦,4♦,5♦,6♦,7♦,8♦,9♦,10♦,J♦,Q♦,K♦,A♦,
2♥,3♥,4♥,5♥,6♥,7♥,8♥,9♥,10♥,J♥,Q♥,K♥,A♥,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,
P2 Q = 28/52P2 ♠ = 1/2P2 Q♠ = 1/2 6= 28/52︸ ︷︷ ︸
P2Q
× 1/2︸︷︷︸P2♠
Events are not independent.
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 13 / 47
An example in events independence
A card is drawn from a deck with 52 unknown cards.
Event 1: Q “queen”Event 2: ♠“spade”Are these events independent?
Probability space 1 (fair deck):
cards =
2♦,3♦,4♦,5♦,6♦,7♦,8♦,9♦,10♦,J♦,Q♦,K♦,A♦,
2♣,3♣,4♣,5♣,6♣,7♣,8♣,9♣,10♣,J♣,Q♣,K♣,A♣,2♥,3♥,4♥,5♥,6♥,7♥,8♥,9♥,10♥,J♥,Q♥,K♥,A♥,2♠,3♠,4♠,5♠,6♠,7♠,8♠,9♠,10♠,J♠,Q♠,K♠,A♠
P1 Q = 4/52
P1 ♠ = 13/52P1 Q♠ = 1/52 = 4/52︸︷︷︸
P1Q
× 13/52︸ ︷︷ ︸P1♠
Events are independent.
Probability space 2 (unfair deck):
cards =
2♦,3♦,4♦,5♦,6♦,7♦,8♦,9♦,10♦,J♦,Q♦,K♦,A♦,
2♥,3♥,4♥,5♥,6♥,7♥,8♥,9♥,10♥,J♥,Q♥,K♥,A♥,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,
P2 Q = 28/52P2 ♠ = 1/2P2 Q♠ = 1/2 6= 28/52︸ ︷︷ ︸
P2Q
× 1/2︸︷︷︸P2♠
Events are not independent.
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 13 / 47
An example in events independence
A card is drawn from a deck with 52 unknown cards.
Event 1: Q “queen”Event 2: ♠“spade”Are these events independent?
Probability space 1 (fair deck):
cards =
2♦,3♦,4♦,5♦,6♦,7♦,8♦,9♦,10♦,J♦,Q♦,K♦,A♦,
2♣,3♣,4♣,5♣,6♣,7♣,8♣,9♣,10♣,J♣,Q♣,K♣,A♣,2♥,3♥,4♥,5♥,6♥,7♥,8♥,9♥,10♥,J♥,Q♥,K♥,A♥,2♠,3♠,4♠,5♠,6♠,7♠,8♠,9♠,10♠,J♠,Q♠,K♠,A♠
P1 Q = 4/52P1 ♠ = 13/52
P1 Q♠ = 1/52 = 4/52︸︷︷︸P1Q
× 13/52︸ ︷︷ ︸P1♠
Events are independent.
Probability space 2 (unfair deck):
cards =
2♦,3♦,4♦,5♦,6♦,7♦,8♦,9♦,10♦,J♦,Q♦,K♦,A♦,
2♥,3♥,4♥,5♥,6♥,7♥,8♥,9♥,10♥,J♥,Q♥,K♥,A♥,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,
P2 Q = 28/52P2 ♠ = 1/2P2 Q♠ = 1/2 6= 28/52︸ ︷︷ ︸
P2Q
× 1/2︸︷︷︸P2♠
Events are not independent.
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 13 / 47
An example in events independence
A card is drawn from a deck with 52 unknown cards.
Event 1: Q “queen”Event 2: ♠“spade”Are these events independent?
Probability space 1 (fair deck):
cards =
2♦,3♦,4♦,5♦,6♦,7♦,8♦,9♦,10♦,J♦,Q♦,K♦,A♦,
2♣,3♣,4♣,5♣,6♣,7♣,8♣,9♣,10♣,J♣,Q♣,K♣,A♣,2♥,3♥,4♥,5♥,6♥,7♥,8♥,9♥,10♥,J♥,Q♥,K♥,A♥,2♠,3♠,4♠,5♠,6♠,7♠,8♠,9♠,10♠,J♠,Q♠,K♠,A♠
P1 Q = 4/52P1 ♠ = 13/52P1 Q♠ = 1/52
= 4/52︸︷︷︸P1Q
× 13/52︸ ︷︷ ︸P1♠
Events are independent.
Probability space 2 (unfair deck):
cards =
2♦,3♦,4♦,5♦,6♦,7♦,8♦,9♦,10♦,J♦,Q♦,K♦,A♦,
2♥,3♥,4♥,5♥,6♥,7♥,8♥,9♥,10♥,J♥,Q♥,K♥,A♥,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,
P2 Q = 28/52P2 ♠ = 1/2P2 Q♠ = 1/2 6= 28/52︸ ︷︷ ︸
P2Q
× 1/2︸︷︷︸P2♠
Events are not independent.
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 13 / 47
An example in events independence
A card is drawn from a deck with 52 unknown cards.
Event 1: Q “queen”Event 2: ♠“spade”Are these events independent?
Probability space 1 (fair deck):
cards =
2♦,3♦,4♦,5♦,6♦,7♦,8♦,9♦,10♦,J♦,Q♦,K♦,A♦,
2♣,3♣,4♣,5♣,6♣,7♣,8♣,9♣,10♣,J♣,Q♣,K♣,A♣,2♥,3♥,4♥,5♥,6♥,7♥,8♥,9♥,10♥,J♥,Q♥,K♥,A♥,2♠,3♠,4♠,5♠,6♠,7♠,8♠,9♠,10♠,J♠,Q♠,K♠,A♠
P1 Q = 4/52P1 ♠ = 13/52P1 Q♠ = 1/52 = 4/52︸︷︷︸
P1Q
× 13/52︸ ︷︷ ︸P1♠
Events are independent.
Probability space 2 (unfair deck):
cards =
2♦,3♦,4♦,5♦,6♦,7♦,8♦,9♦,10♦,J♦,Q♦,K♦,A♦,
2♥,3♥,4♥,5♥,6♥,7♥,8♥,9♥,10♥,J♥,Q♥,K♥,A♥,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,
P2 Q = 28/52P2 ♠ = 1/2P2 Q♠ = 1/2 6= 28/52︸ ︷︷ ︸
P2Q
× 1/2︸︷︷︸P2♠
Events are not independent.
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 13 / 47
An example in events independence
A card is drawn from a deck with 52 unknown cards.
Event 1: Q “queen”Event 2: ♠“spade”Are these events independent?
Probability space 1 (fair deck):
cards =
2♦,3♦,4♦,5♦,6♦,7♦,8♦,9♦,10♦,J♦,Q♦,K♦,A♦,
2♣,3♣,4♣,5♣,6♣,7♣,8♣,9♣,10♣,J♣,Q♣,K♣,A♣,2♥,3♥,4♥,5♥,6♥,7♥,8♥,9♥,10♥,J♥,Q♥,K♥,A♥,2♠,3♠,4♠,5♠,6♠,7♠,8♠,9♠,10♠,J♠,Q♠,K♠,A♠
P1 Q = 4/52P1 ♠ = 13/52P1 Q♠ = 1/52 = 4/52︸︷︷︸
P1Q
× 13/52︸ ︷︷ ︸P1♠
Events are independent.
Probability space 2 (unfair deck):
cards =
2♦,3♦,4♦,5♦,6♦,7♦,8♦,9♦,10♦,J♦,Q♦,K♦,A♦,
2♥,3♥,4♥,5♥,6♥,7♥,8♥,9♥,10♥,J♥,Q♥,K♥,A♥,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,
P2 Q = 28/52P2 ♠ = 1/2P2 Q♠ = 1/2 6= 28/52︸ ︷︷ ︸
P2Q
× 1/2︸︷︷︸P2♠
Events are not independent.
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 13 / 47
An example in events independence
A card is drawn from a deck with 52 unknown cards.
Event 1: Q “queen”Event 2: ♠“spade”Are these events independent?
Probability space 2 (unfair deck):
cards =
2♦,3♦,4♦,5♦,6♦,7♦,8♦,9♦,10♦,J♦,Q♦,K♦,A♦,
2♥,3♥,4♥,5♥,6♥,7♥,8♥,9♥,10♥,J♥,Q♥,K♥,A♥,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,
P2 Q = 28/52P2 ♠ = 1/2P2 Q♠ = 1/2 6= 28/52︸ ︷︷ ︸
P2Q
× 1/2︸︷︷︸P2♠
Events are not independent.
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 13 / 47
An example in events independence
A card is drawn from a deck with 52 unknown cards.
Event 1: Q “queen”Event 2: ♠“spade”Are these events independent?
Probability space 2 (unfair deck):
cards =
2♦,3♦,4♦,5♦,6♦,7♦,8♦,9♦,10♦,J♦,Q♦,K♦,A♦,
2♥,3♥,4♥,5♥,6♥,7♥,8♥,9♥,10♥,J♥,Q♥,K♥,A♥,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,
P2 Q = 28/52P2 ♠ = 1/2P2 Q♠ = 1/2 6= 28/52︸ ︷︷ ︸
P2Q
× 1/2︸︷︷︸P2♠
Events are not independent.
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 13 / 47
An example in events independence
A card is drawn from a deck with 52 unknown cards.
Event 1: Q “queen”Event 2: ♠“spade”Are these events independent?
Probability space 2 (unfair deck):
cards =
2♦,3♦,4♦,5♦,6♦,7♦,8♦,9♦,10♦,J♦,Q♦,K♦,A♦,
2♥,3♥,4♥,5♥,6♥,7♥,8♥,9♥,10♥,J♥,Q♥,K♥,A♥,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,
P2 Q = 28/52
P2 ♠ = 1/2P2 Q♠ = 1/2 6= 28/52︸ ︷︷ ︸
P2Q
× 1/2︸︷︷︸P2♠
Events are not independent.
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 13 / 47
An example in events independence
A card is drawn from a deck with 52 unknown cards.
Event 1: Q “queen”Event 2: ♠“spade”Are these events independent?
Probability space 2 (unfair deck):
cards =
2♦,3♦,4♦,5♦,6♦,7♦,8♦,9♦,10♦,J♦,Q♦,K♦,A♦,
2♥,3♥,4♥,5♥,6♥,7♥,8♥,9♥,10♥,J♥,Q♥,K♥,A♥,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,
P2 Q = 28/52P2 ♠ = 1/2
P2 Q♠ = 1/2 6= 28/52︸ ︷︷ ︸P2Q
× 1/2︸︷︷︸P2♠
Events are not independent.
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 13 / 47
An example in events independence
A card is drawn from a deck with 52 unknown cards.
Event 1: Q “queen”Event 2: ♠“spade”Are these events independent?
Probability space 2 (unfair deck):
cards =
2♦,3♦,4♦,5♦,6♦,7♦,8♦,9♦,10♦,J♦,Q♦,K♦,A♦,
2♥,3♥,4♥,5♥,6♥,7♥,8♥,9♥,10♥,J♥,Q♥,K♥,A♥,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,
P2 Q = 28/52P2 ♠ = 1/2P2 Q♠ = 1/2
6= 28/52︸ ︷︷ ︸P2Q
× 1/2︸︷︷︸P2♠
Events are not independent.
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 13 / 47
An example in events independence
A card is drawn from a deck with 52 unknown cards.
Event 1: Q “queen”Event 2: ♠“spade”Are these events independent?
Probability space 2 (unfair deck):
cards =
2♦,3♦,4♦,5♦,6♦,7♦,8♦,9♦,10♦,J♦,Q♦,K♦,A♦,
2♥,3♥,4♥,5♥,6♥,7♥,8♥,9♥,10♥,J♥,Q♥,K♥,A♥,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,
P2 Q = 28/52P2 ♠ = 1/2P2 Q♠ = 1/2 6= 28/52︸ ︷︷ ︸
P2Q
× 1/2︸︷︷︸P2♠
Events are not independent.
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 13 / 47
An example in events independence
A card is drawn from a deck with 52 unknown cards.
Event 1: Q “queen”Event 2: ♠“spade”Are these events independent?
Probability space 2 (unfair deck):
cards =
2♦,3♦,4♦,5♦,6♦,7♦,8♦,9♦,10♦,J♦,Q♦,K♦,A♦,
2♥,3♥,4♥,5♥,6♥,7♥,8♥,9♥,10♥,J♥,Q♥,K♥,A♥,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,Q♠,
P2 Q = 28/52P2 ♠ = 1/2P2 Q♠ = 1/2 6= 28/52︸ ︷︷ ︸
P2Q
× 1/2︸︷︷︸P2♠
Events are not independent.
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 13 / 47
Further remarks on probability
The last example shows that:
Different probability spaces, for the same random experiment,can produce different predictions
A probability space that does not accurately describe a randomevent can produce completely erroneous predictions
The notion of independence strongly depends on the probabilitymeasure employed
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 14 / 47
Random variable
A mapping X : Ω→ R is called a random variable (RV) if thepreimage of every real number under X is a relevant event, i.e.,
X−1(x) =ω ∈ Ω : X (ω) ≤ x
∈ Σ, for every x ∈ R.
A collection of events in Ω ismapped to an interval E on thereal line under such mapping.
RV are numerical characteristics of interesting events.
Remark:A random variable is a function from Ω to R, not a real number.
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 15 / 47
*Picture from http://theanalysisofdata.com/probability/2_1.htmll
Examples of random variables
1 Rolling die experiment
Ω = 1, 2, 3, 4, 5, 6
X1(ω) =
1 if ω is even
0 if ω is odd
(random variable)
Σ =φ, 1, 3, 5, 2, 4, 6,Ω
X2(ω) = ω2
(not a random variable)
2 Temperature (in Kelvin) measurement experiment
Ω = [a, b] ⊂ [0,+∞)
X (ω) = −459.67 + 1.8ω(random variable)
Σ = B[a,b] (Borel σ-algebra)
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 16 / 47
G. Grimmett and D. Welsh, Probability: An Introduction. Oxford University Press, 2 edition, 2014.
Probability distribution
The probability distribution of X , denoted by FX , is defined as theprobability of the elementary event X ≤ x, i.e.,
FX (x) = P X ≤ x .
FX is also known as cumulative distribution function (CDF) and hasthe following properties:
0 ≤ FX (x) ≤ 1
FX is a monotonic, non decreasing, right continuous function
Px1 < X ≤ x2 = FX (x2)− FX (x1) =
∫ x2
x1
dFX (x)∫RdFX (x) = 1
FX (−∞) = 0 and FX (+∞) = 1
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 17 / 47
Probability density function
If the function FX is differentiable, its derivative pX (x) = dFX (x)/dxis called probability density function (PDF) of X , and one has
FX (x) =
∫ x
−∞pX (ξ) dξ.
Note also that:
pX (x) ≥ 0 for every x ∈ R∫RpX (x) dx = 1
Remark:Intuitively, pX (x) dx can be thought of as the probability of Xfalling within the infinitesimal interval [x , x + dx ].
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 18 / 47
Types of random variables
1 Discrete random variableDistribution is discontinuous.Assumes a denumerable number of values.Typically associated with counting processes.
2 Continuous random variableDistribution is continuous.Assumes a non-denumarable number of values.Typically associated with measuring processes.
3 Mixed random variableDistribution has points of discontinuity.Assumes a non-denumarable number of values.A “mixture” of the two previous types.
4 Singular random variableDistribution is different from the other types.It has theoretical interest only.
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 19 / 47
Function of random variable
The function of random variable Y = h (X ), for a random vari-able X and measurable mapping h : R → R, is also a randomvariable, which has its own probability distribution.
Example:
Let h(x) = x2 and X : Ω→ [1, 2] be a random variable such that
FX (x) =
0 if x < 1
1/2 if 1 ≤ x ≤ 2
1 if x > 2.
The composition Y = X 2 is a random variable with distribution
FY (y) =
0 if y < 1
1/2√y if 1 ≤ y ≤ 4
1 if y > 4.
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 20 / 47
Mathematical expectation operator
The mathematical expectation of a random variable X is defined as
E X =
∫Rx dFX (x).
The mathematical expectation is a linear operator since
E α1 X1 + α2 X2 = α1 E X1+ α2 E X2 ,
for any pairs of number α1, α2 and random variables X1,X2.
Theorem (law of the unconscious statistician):
Given a measurable mapping h : R → R and a random variable Xthe expected value of h (X ) is given by
Eh (X )
=
∫Rh (x) dFX (x).
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 21 / 47
Mean value
The mean value of the random variable X is defined as
µX = E X
=
∫Rx dFX (x)
=
∫Rx pX (x) dx .
(measure of the central tendency)
Remark:The mean value µX is the constant which best approximate therandom variable X . The error of this approximation is the standarddeviation σX .
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 22 / 47
Variance
The variance of the random variable X is defined as
σ2X = E
(X − µX )2
=
∫R
(x − µX )2 dFX (x)
=
∫R
(x − µX )2 pX (x) dx .
(measure of dispersion about the mean)
Note that variance can also be written as
σ2X = E
X 2−(E X
)2.
Remark:σ2X has the same unit as X 2.
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 23 / 47
Standard deviation and variation coefficient
Other second-order statistics of X are the standard deviation
σX =√σ2X ,
and the variation coefficient
δX = σX/µX , µX 6= 0.
(both are measures of dispersion about the mean)
Remark:σX has the same unit as X and δX is dimensionless.
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 24 / 47
Probability Distributions
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 25 / 47
Uniform
Notation: U(a, b)
Support: [a, b]
Parameters:
−∞ < a < b < +∞ — boundaries
PDF:
pX (x) =1
b − a1[a,b](x)
Statistics:
µ =1
2(a + b)
σ2 =1
12(b − a)2
Probability Density Function
Cumulative Distribution Function
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 26 / 47
Uniform distribution — Wikipedia, The Free Encyclopedia, 2017.
Gamma
Notation: Gamma(k, θ)
Support: (0,+∞)
Parameters:
k — shape parameterθ — scale parameter
PDF:
pX (x) =1
Γ(k) θkxk−1 e(−x/θ) 1(0,+∞)(x)
Statistics:
µ = k θσ2 = k θ2
Probability Density Function
Cumulative Distribution Function
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 27 / 47
Gamma distribution — Wikipedia, The Free Encyclopedia, 2017.
Gaussian
Notation: N (µ, σ2)
Support: (−∞,+∞)
Parameters:
µ — meanσ2 — variance
PDF:
pX (x) =1√
2π σ2exp
−(x − µ)2
2σ2
Probability Density Function
Cumulative Distribution Function
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 28 / 47
Normal distribution — Wikipedia, The Free Encyclopedia, 2017.
Characterization of a probability distribution
Given the mean µ and standard deviation σ of a random variable.Is the distribution well-defined?
No! Multiple distributions may have the same statistics.
pX (x) =1√6π
exp
−(x − 1)2
6
and pX (x) =
1
61[−2,4](x)
have µ = 1 and σ =√
3.
What type of information determines a distribution?
cumulative distribution function
probability density function (if exists)
quantile function
characteristic function
moment-generating function (if exists)
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 29 / 47
Characterization of a probability distribution
Given the mean µ and standard deviation σ of a random variable.Is the distribution well-defined?
No! Multiple distributions may have the same statistics.
pX (x) =1√6π
exp
−(x − 1)2
6
and pX (x) =
1
61[−2,4](x)
have µ = 1 and σ =√
3.
What type of information determines a distribution?
cumulative distribution function
probability density function (if exists)
quantile function
characteristic function
moment-generating function (if exists)
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 29 / 47
Characterization of a probability distribution
Given the mean µ and standard deviation σ of a random variable.Is the distribution well-defined?
No! Multiple distributions may have the same statistics.
pX (x) =1√6π
exp
−(x − 1)2
6
and pX (x) =
1
61[−2,4](x)
have µ = 1 and σ =√
3.
What type of information determines a distribution?
cumulative distribution function
probability density function (if exists)
quantile function
characteristic function
moment-generating function (if exists)
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 29 / 47
Characterization of a probability distribution
Given the mean µ and standard deviation σ of a random variable.Is the distribution well-defined?
No! Multiple distributions may have the same statistics.
pX (x) =1√6π
exp
−(x − 1)2
6
and pX (x) =
1
61[−2,4](x)
have µ = 1 and σ =√
3.
What type of information determines a distribution?
cumulative distribution function
probability density function (if exists)
quantile function
characteristic function
moment-generating function (if exists)
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 29 / 47
Characterization of a probability distribution
Given the mean µ and standard deviation σ of a random variable.Is the distribution well-defined?
No! Multiple distributions may have the same statistics.
pX (x) =1√6π
exp
−(x − 1)2
6
and pX (x) =
1
61[−2,4](x)
have µ = 1 and σ =√
3.
What type of information determines a distribution?
cumulative distribution function
probability density function (if exists)
quantile function
characteristic function
moment-generating function (if exists)
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 29 / 47
Probability in Dimension n
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 30 / 47
Random vector
Let x = (x1, · · · , xn) ∈ Rn. A random vector X = (X1, · · · ,Xn) is acollection of n random variables Xi : Ω → R that together may beconsidered a (measurable) mapping X : Ω→ Rn.
A collection of event in Ωis mapped into a region onthe Euclidean space under suchmapping.
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 31 / 47
*Picture from http://theanalysisofdata.com/probability/4_1.html
Joint probability distribution
The joint probability distribution of random vector X = (X1, · · · ,Xn),denoted by FX, is defined as
FX(x1, · · · , xn) = PX1 ≤ x1 ∩ · · · ∩ Xn ≤ xn
.
Thus,
P a < X ≤ b =
∫ b1
a1
· · ·∫ bn
an
dFX (x1, · · · , xn),
in which a < X ≤ b = a1 < X1 ≤ b1 ∩ · · · ∩ an < Xn ≤ bn.
FX is also known as joint cumulative distribution function.
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 32 / 47
Joint probability density function
If pX(x1, · · · , xn) = ∂n FX/∂x1 · · · ∂xn exists, for any x1, · · · , xn,then it is called joint probability density function of X, and one has
FX(x1, · · · , xn) =
∫ x1
−∞· · ·∫ xn
−∞pX(ξ1, · · · , ξn) dξ1 · · · dξn.
Note also that:
pX(x1, · · · , xn) ≥ 0 for every (x1, · · · , xn) ∈ Rn∫ +∞
−∞· · ·∫ +∞
−∞pX(x1, · · · , xn) dx1 · · · dxn = 1
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 33 / 47
Marginal probability density function
The marginal probability density function of Xi is defined as
pXi(xi ) =
∫ +∞
−∞· · ·∫ +∞
−∞︸ ︷︷ ︸n−1 times
pX(x1, · · · , xn) dx1 · · · dxi−1 dxi+1 · · · dxn,
for i = 1, · · · , n.
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 34 / 47
Conditional distribution
Consider a pair of jointly distributed random variables X and Y .The conditional distribution of X , given the occurrence of the valuey of Y , is defined as
FX |Y (x | y) =FX ,Y (x , y)
FY (y).
Thus
FX ,Y (x , y) = FX |Y (x |y)× FY (y),
and
pX ,Y (x , y) = pX |Y (x |y)× pY (y).
Remark:This definition extends naturally to the n-dimensional case.
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 35 / 47
Independence of distributions
The random variables X and Y are said to be independent if therealization of X does not affect the probability distribution of Y ,i.e.,
FX |Y (x |y) = FX (x).
Therefore, for independent random variable one has
FX Y (x , y) = FX (x)× FY (y),
and
pX Y (x , y) = pX (x)× pY (y).
Remark:This definition extends naturally to the n-dimensional case.
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 36 / 47
Statistics of random vectors
second-order random vector
E‖ X ‖2
=
∫Rn
‖ x ‖2 dFX(x) < +∞
mean vector
mX = E X =
∫Rn
x dFX(x) ∈ Rn
correlation matrix
[RXY] = E
XYT
=
∫Rn
xxT dFXY(x, y) ∈ Rn×n
covariance matrix
[KXY] = E
(X−mX) (Y−mY)T∈ Rn×n
= [RXY]−mXmTY
Remark:Matrices [RXY] and [KXY] are symetric positive semi-definitewhen X = Y.
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 37 / 47
Correlation of random variables
The random vectors X = (X1, · · · ,Xn) and Y = (Y1, · · · ,Yn) aresaid to be uncorrelated if covariance matrix [KXY] is null, i.e.,
[RXY] = mXmTY .
If two random vectors are independent, then they are uncorrelated.
independence =⇒ uncorrelation
But uncorrelated random vectors are not independent in general.
uncorrelation 6=⇒ independence
Remark:Uncorrelated random vectors which the joint distribution is Gaussianare independent.
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 38 / 47
Notions of Random Processes
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 39 / 47
Random process
A real-valued random process (also called stochastic process) definedon probability space (Ω,Σ,P), indexed by t ∈ T , is a mapping
(t, ω) ∈ T × Ω→ X (t, ω) ∈ R,
such that, for fixed t, the output is a random variable X (t, ·), whilefor fixed ω, X (·, ω) is a function of t (sample function).
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 40 / 47
*Picture from https://www.vocal.com/noise-reduction/statistical-analysis-random-signals/
Interpretation and classification
Note that, by definition, a real-valued random process X (t, ω) is acollection of real-valued random variables indexed by a parameter,and can be thought of as a time-dependent random variable.
According to the nature of the set of indices T , a stochastic processis classified as:
discrete – if T is countable
continuous – if T is uncountable
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 41 / 47
Wiener process (Brownian motion process)
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 42 / 47
Wiener process — Wikipedia, The Free Encyclopedia, 2017.
White noise and colored noise
White noiseColored noise
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 43 / 47
White noise — Wikipedia, The Free Encyclopedia, 2017.
Finite-dimensional distribution
A random process Xt = X (t, ω) is statistically defined to the order nby a set of its n-th order joint probability distribution, i.e.,
FXt (x1, · · · , xn) = P(Xt1 ≤ x1, · · · ,Xtn ≤ xn
),
which is the joint CDF of random variables Xt1 , · · · ,Xtn .
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 44 / 47
Statistics of a random process
second-order random process
EX 2t
=
∫Rx2 dFXt (x) < +∞
mean function
µX (t) = E Xt =
∫Rx dFXt (x)
correlation function
corrX (t1, t2) = EXt1Xt2
=
∫Rx1 x2 dFXt1Xt2
(x1, x2)
covariance function
covX (t1, t2) = E(
Xt1 − µX (t1)) (
Xt2 − µX (t2))
= corrX (t1, t2)− µX (t1)µX (t2)
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 45 / 47
Stationary process
A stochastic process is stationary when all the random variables ofthat stochastic process are identically distributed.
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 46 / 47
Stationary process — Wikipedia, The Free Encyclopedia, 2017.
References
G. Grimmett and D. Welsh, Probability: An Introduction. Oxford University Press, 2 edition, 2014.
J. Jacod and P. Protter, Probability Essentials. Springer, 2nd edition, 2004.
A. Klenke, Probability Theory: A Comprehensive Course. Springer, 2nd edition, 2014.
A. Papoulis and S. U. Pillai, Probability, Random Variables and Stochastic Processes. McGraw-HillEurope; 4th edition, 2002.
c© A. Cunha Jr (UERJ) Uncertainty Quantification in Computational Predictive Science 47 / 47