INFORMATION THEORY, CODING & CRYPTOGRAPHY (MCSE 202)

UNIT : II

PREPARED BY ARUN PRATAP SINGH, M.TECH 2nd SEMESTER (5/26/14)


    STOCHASTIC PROCESS :

In probability theory, a stochastic process, or sometimes random process, is a collection of random variables; it is often used to represent the evolution of some random value, or system, over time. It is the probabilistic counterpart to a deterministic process (or deterministic system). Instead of describing a process which can evolve in only one way (as in the case, for example, of solutions of an ordinary differential equation), in a stochastic or random process there is some indeterminacy: even if the initial condition (or starting point) is known, there are several (often infinitely many) directions in which the process may evolve.

In the simple case of discrete time, as opposed to continuous time, a stochastic process is a sequence of random variables together with the time series associated with them (for example, a Markov chain, also known as a discrete-time Markov chain). Another basic type of stochastic process is a random field, whose domain is a region of space; in other words, a random function whose arguments are drawn from a range of continuously changing values. One approach to stochastic processes treats them as functions of one or several deterministic arguments (inputs, in most cases regarded as time) whose values (outputs) are random variables: non-deterministic (single) quantities which have certain probability distributions. The random variables corresponding to various times (or points, in the case of random fields) may be completely different. The main requirement is that these different random quantities all have the same type, where "type" refers to the codomain of the function. Although the random values of a stochastic process at different times may be independent random variables, in most commonly considered situations they exhibit complicated statistical correlations.

Stock market fluctuations have been modeled by stochastic processes.


Given a probability space (Ω, F, P) and a measurable space (S, Σ), an S-valued stochastic process is a collection of S-valued random variables on Ω, indexed by a totally ordered set T ("time"). That is, a stochastic process X is a collection

    { X_t : t ∈ T }

where each X_t is an S-valued random variable on Ω. The space S is then called the state space of the process.
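As a concrete sketch (our own illustration, not from the original notes), the following simulates one of the simplest discrete-time stochastic processes, a symmetric random walk on the integers: the initial condition is fixed, yet different runs evolve in different directions.

```python
import random

def random_walk(n_steps, seed=None):
    """Simulate one sample path of a symmetric random walk: X_0 = 0 and
    X_{t+1} = X_t + D_t, where D_t is +1 or -1, each with probability 1/2."""
    rng = random.Random(seed)
    x, path = 0, [0]
    for _ in range(n_steps):
        x += rng.choice((-1, 1))
        path.append(x)
    return path

# Two runs from the same initial condition evolve differently:
print(random_walk(10, seed=1))
print(random_walk(10, seed=2))
```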

    STATISTICAL INDEPENDENCE :

In probability theory, to say that two events are independent (alternatively called statistically independent or stochastically independent)[1] means that the occurrence of one does not affect the probability of the other. Similarly, two random variables are independent if the realization of one does not affect the probability distribution of the other.

In some instances, the term "independent" is replaced by "statistically independent", "marginally independent", or "absolutely independent".

For events :

Two events

Two events A and B are independent if and only if their joint probability equals the product of their probabilities:

    P(A ∩ B) = P(A) P(B).


Why this defines independence is made clear by rewriting with conditional probabilities:

    P(A | B) = P(A ∩ B) / P(B) = P(A),

and similarly

    P(B | A) = P(B).

Thus, the occurrence of B does not affect the probability of A, and vice versa. Although the derived expressions may seem more intuitive, they are not the preferred definition, as the conditional probabilities may be undefined if P(A) or P(B) is 0. Furthermore, the preferred definition makes clear by symmetry that when A is independent of B, B is also independent of A.
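A minimal simulation sketch (ours, anticipating the dice example given later in these notes): the events "first die shows 6" and "second die shows 6" satisfy the product rule, while "first die shows 6" and "the dice sum to 8" do not.

```python
import random

rng = random.Random(0)
N = 200_000
a = b = c = ab = ac = 0
for _ in range(N):
    d1, d2 = rng.randint(1, 6), rng.randint(1, 6)
    a += d1 == 6
    b += d2 == 6
    c += d1 + d2 == 8
    ab += d1 == 6 and d2 == 6
    ac += d1 == 6 and d1 + d2 == 8

# Independent: P(A and B) ~ P(A) P(B) = 1/36.
print(ab / N, (a / N) * (b / N))
# Not independent: P(A and C) = 1/36, but P(A) P(C) = (1/6)(5/36).
print(ac / N, (a / N) * (c / N))
```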

More than two events

A finite set of events {A_i} is pairwise independent iff every pair of events is independent.[2] That is, if and only if for all distinct pairs of indices m, n,

    P(A_m ∩ A_n) = P(A_m) P(A_n).

A finite set of events is mutually independent if and only if every event is independent of any intersection of the other events.[2] That is, iff for every finite subset {A_1, ..., A_n},

    P(A_1 ∩ A_2 ∩ ... ∩ A_n) = P(A_1) P(A_2) ... P(A_n).

This is called the multiplication rule for independent events.

For more than two events, a mutually independent set of events is (by definition) pairwise independent, but the converse is not necessarily true.

For random variables

Two random variables

Two random variables X and Y are independent iff the elements of the π-system generated by them are independent; that is to say, for every a and b, the events {X ≤ a} and {Y ≤ b} are independent events (as defined above). That is, X and Y with cumulative distribution functions F_X(x) and F_Y(y), and probability densities f_X(x) and f_Y(y), are independent if and only if (iff) the combined random variable (X, Y) has a joint cumulative distribution function

    F_{X,Y}(x, y) = F_X(x) F_Y(y),

or equivalently, a joint density

    f_{X,Y}(x, y) = f_X(x) f_Y(y).


More than two random variables

A set of random variables is pairwise independent iff every pair of random variables is independent.

A set of random variables is mutually independent iff for any finite subset X_1, ..., X_n and any finite sequence of numbers a_1, ..., a_n, the events {X_1 ≤ a_1}, ..., {X_n ≤ a_n} are mutually independent events (as defined above).

The measure-theoretically inclined may prefer to substitute events {X ∈ A} for events {X ≤ a} in the above definition, where A is any Borel set. That definition is exactly equivalent to the one above when the values of the random variables are real numbers. It has the advantage of working also for complex-valued random variables or for random variables taking values in any measurable space (which includes topological spaces endowed with appropriate σ-algebras).

Conditional independence

Intuitively, two random variables X and Y are conditionally independent given Z if, once Z is known, the value of Y does not add any additional information about X. For instance, two measurements X and Y of the same underlying quantity Z are not independent, but they are conditionally independent given Z (unless the errors in the two measurements are somehow connected).

The formal definition of conditional independence is based on the idea of conditional distributions. If X, Y, and Z are discrete random variables, then we define X and Y to be conditionally independent given Z if

    P(X = x, Y = y | Z = z) = P(X = x | Z = z) P(Y = y | Z = z)

for all x, y and z such that P(Z = z) > 0. On the other hand, if the random variables are continuous and have a joint probability density function p, then X and Y are conditionally independent given Z if

    p_{X,Y|Z}(x, y | z) = p_{X|Z}(x | z) p_{Y|Z}(y | z)

for all real numbers x, y and z such that p_Z(z) > 0.

If X and Y are conditionally independent given Z, then

    P(X = x | Y = y, Z = z) = P(X = x | Z = z)

for any x, y and z with P(Z = z) > 0. That is, the conditional distribution for X given Y and Z is the same as that given Z alone. A similar equation holds for the conditional probability density functions in the continuous case.

Independence can be seen as a special kind of conditional independence, since probability can be seen as a kind of conditional probability given no events.
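The following sketch (our own illustration, assuming Gaussian measurement noise) shows the two-measurement example numerically: X = Z + noise and Y = Z + noise are strongly correlated overall, but once Z is fixed only the independent noise terms remain.

```python
import random

rng = random.Random(0)

def corr(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    vx = sum((x - mx) ** 2 for x in xs) / n
    vy = sum((y - my) ** 2 for y in ys) / n
    return cov / (vx * vy) ** 0.5

# Unconditionally: Z varies, so X and Y are strongly correlated.
zs = [rng.gauss(0, 3) for _ in range(50_000)]
xs = [z + rng.gauss(0, 1) for z in zs]
ys = [z + rng.gauss(0, 1) for z in zs]
print("corr(X, Y)         =", round(corr(xs, ys), 3))   # close to 0.9

# Conditionally on Z = 0: only the independent noise remains.
xs0 = [rng.gauss(0, 1) for _ in range(50_000)]
ys0 = [rng.gauss(0, 1) for _ in range(50_000)]
print("corr(X, Y | Z = 0) =", round(corr(xs0, ys0), 3)) # close to 0
```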


Independent σ-algebras

The definitions above are both generalized by the following definition of independence for σ-algebras. Let (Ω, Σ, Pr) be a probability space and let A and B be two sub-σ-algebras of Σ. A and B are said to be independent if, whenever A ∈ A and B ∈ B,

    Pr(A ∩ B) = Pr(A) Pr(B).

Likewise, a finite family of σ-algebras (Σ_i) is said to be independent if and only if

    Pr(A_1 ∩ ... ∩ A_n) = Pr(A_1) ... Pr(A_n)   whenever A_i ∈ Σ_i,

and an infinite family of σ-algebras is said to be independent if all its finite subfamilies are independent.

The new definition relates to the previous ones very directly:

Two events are independent (in the old sense) if and only if the σ-algebras that they generate are independent (in the new sense). The σ-algebra generated by an event E is, by definition,

    σ(E) = { ∅, E, Ω \ E, Ω }.

Two random variables X and Y defined over Ω are independent (in the old sense) if and only if the σ-algebras that they generate are independent (in the new sense). The σ-algebra generated by a random variable X taking values in some measurable space S consists, by definition, of all subsets of Ω of the form X⁻¹(U), where U is any measurable subset of S.

Using this definition, it is easy to show that if X and Y are random variables and Y is constant, then X and Y are independent, since the σ-algebra generated by a constant random variable is the trivial σ-algebra {∅, Ω}. Probability zero events cannot affect independence, so independence also holds if Y is only Pr-almost surely constant.

Properties :

Self-dependence

Note that an event is independent of itself iff

    P(A) = P(A ∩ A) = P(A) P(A), i.e. P(A) = 0 or P(A) = 1.

Thus if an event or its complement almost surely occurs, it is independent of itself. For example, if A is the event of choosing any number but 0.5 from a uniform distribution on the unit interval, A is independent of itself, even though, tautologically, A fully determines A.

Expectation and covariance

If X and Y are independent, then the expectation operator E has the property

    E[XY] = E[X] E[Y],


and for the covariance, since

    cov(X, Y) = E[XY] − E[X] E[Y],

the covariance cov(X, Y) is zero. (The converse, i.e. the proposition that if two random variables have a covariance of 0 they must be independent, is not true. See uncorrelated.)
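A quick numerical sketch (ours) of both directions of this statement: independent variables have covariance near 0, while X and X² are uncorrelated yet clearly dependent when X is symmetric about 0.

```python
import random

rng = random.Random(0)

def cov(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n

n = 100_000
# Independent X, Y: covariance is (approximately) zero.
xs = [rng.gauss(0, 1) for _ in range(n)]
ys = [rng.gauss(0, 1) for _ in range(n)]
print(round(cov(xs, ys), 3))

# Dependent but uncorrelated: Y = X^2 with X symmetric about 0,
# so cov(X, X^2) = E[X^3] = 0 even though Y is a function of X.
print(round(cov(xs, [x * x for x in xs]), 3))
```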

Characteristic function

Two random variables X and Y are independent if and only if the characteristic function of the random vector (X, Y) satisfies

    φ_{(X,Y)}(t, s) = φ_X(t) φ_Y(s).

In particular, the characteristic function of their sum is the product of their marginal characteristic functions:

    φ_{X+Y}(t) = φ_X(t) φ_Y(t),

though the reverse implication is not true. Random variables that satisfy the latter condition are called sub-independent.

Examples :

Rolling a die

The event of getting a 6 the first time a die is rolled and the event of getting a 6 the second time are independent. By contrast, the event of getting a 6 the first time a die is rolled and the event that the sum of the numbers seen on the first and second trials is 8 are not independent.

Drawing cards

If two cards are drawn with replacement from a deck of cards, the event of drawing a red card on the first trial and that of drawing a red card on the second trial are independent. By contrast, if two cards are drawn without replacement from a deck of cards, the event of drawing a red card on the first trial and that of drawing a red card on the second trial are not independent.
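To make the card example concrete (our own worked numbers, for a standard 52-card deck with 26 red cards): with replacement, P(red on 2nd | red on 1st) = 26/52 = 1/2 = P(red on 2nd), so the events are independent. Without replacement, P(red on 2nd | red on 1st) = 25/51 ≈ 0.490 ≠ 1/2 = P(red on 2nd), so they are not.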

Pairwise and mutual independence

Consider the two probability spaces shown in the figures below. In both cases, P(A) = P(B) = 1/2 and P(C) = 1/4. The first space is pairwise independent but not mutually independent. The second space is mutually independent. To illustrate the difference, consider conditioning on two events. In the pairwise independent case, although, for example, A is independent of both B and C, it is not independent of B ∩ C:

    P(A | B ∩ C) ≠ P(A).


In the mutually independent case, however,

    P(A | B ∩ C) = P(A).

See also a three-event example in which

    P(A ∩ B ∩ C) = P(A) P(B) P(C)

and yet no two of the three events are pairwise independent.

[Figure: Pairwise independent, but not mutually independent, events.]


[Figure: Mutually independent events.]
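Since the original figures are not reproduced here, the sketch below checks a classic construction of our own choosing (Bernstein's two fair coins) with the same flavor: three events that are pairwise independent but not mutually independent.

```python
from itertools import product

# Sample space: two fair coin flips, each outcome has probability 1/4.
outcomes = list(product("HT", repeat=2))

A = {w for w in outcomes if w[0] == "H"}   # first coin heads
B = {w for w in outcomes if w[1] == "H"}   # second coin heads
C = {w for w in outcomes if w[0] == w[1]}  # both coins agree

def p(event):
    return len(event) / len(outcomes)

# Pairwise independent: P(X & Y) = P(X) P(Y) for each pair.
for X, Y in [(A, B), (A, C), (B, C)]:
    assert p(X & Y) == p(X) * p(Y)

# Not mutually independent: P(A & B & C) != P(A) P(B) P(C).
print(p(A & B & C), p(A) * p(B) * p(C))   # 0.25 vs 0.125
```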

    BERNOULLI PROCESS :

In probability and statistics, a Bernoulli process is a finite or infinite sequence of binary random variables, so it is a discrete-time stochastic process that takes only two values, canonically 0 and 1. The component Bernoulli variables X_i are identically distributed and independent. Prosaically, a Bernoulli process is repeated coin flipping, possibly with an unfair coin (but with consistent unfairness). Every variable X_i in the sequence is associated with a Bernoulli trial or experiment. They all have the same Bernoulli distribution. Much of what can be said about the Bernoulli process can also be generalized to more than two outcomes (such as the process for a six-sided die); this generalization is known as the Bernoulli scheme.

A Bernoulli process is a finite or infinite sequence of independent random variables X1, X2, X3, ..., such that

For each i, the value of X_i is either 0 or 1;

For all values of i, the probability that X_i = 1 is the same number p.

In other words, a Bernoulli process is a sequence of independent identically distributed Bernoulli trials.
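A minimal simulation sketch (ours, with an assumed bias p = 0.3) of a Bernoulli process; the empirical frequency of 1s approaches p:

```python
import random

def bernoulli_process(p, n, seed=None):
    """Generate n i.i.d. Bernoulli(p) variables X_1, ..., X_n."""
    rng = random.Random(seed)
    return [1 if rng.random() < p else 0 for _ in range(n)]

xs = bernoulli_process(p=0.3, n=100_000, seed=42)
print(sum(xs) / len(xs))   # close to 0.3 by the law of large numbers
```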

Independence of the trials implies that the process is memoryless. Given that the probability p is known, past outcomes provide no information about future outcomes. (If p is unknown, however, the past informs about the future indirectly, through inferences about p.)

If the process is infinite, then from any point the future trials constitute a Bernoulli process identical to the whole process: the fresh-start property.

Interpretation

The two possible values of each X_i are often called "success" and "failure". Thus, when expressed as a number 0 or 1, the outcome may be called the number of successes on the i-th "trial".


Two other common interpretations of the values are true or false and yes or no. Under any interpretation of the two values, the individual variables X_i may be called Bernoulli trials with parameter p.

In many applications time passes between trials, as the index i increases. In effect, the trials X_1, X_2, ..., X_i, ... happen at "points in time" 1, 2, ..., i, .... That passage of time and the associated notions of "past" and "future" are not necessary, however. Most generally, any X_i and X_j in the process are simply two from a set of random variables indexed by {1, 2, ..., n} or by {1, 2, 3, ...}, the finite and infinite cases.

Several random variables and probability distributions besides the Bernoullis may be derived from the Bernoulli process:

The number of successes in the first n trials, which has a binomial distribution B(n, p)

The number of trials needed to get r successes, which has a negative binomial distribution NB(r, p)

The number of trials needed to get one success, which has a geometric distribution NB(1, p), a special case of the negative binomial distribution

The negative binomial variables may be interpreted as random waiting times.
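The sketch below (our own illustration) recovers two of these derived distributions empirically from simulated Bernoulli trials: the binomial count of successes in the first n trials, and the geometric waiting time until the first success.

```python
import random

rng = random.Random(0)
p, n, runs = 0.3, 20, 50_000

successes_in_n = []   # should follow Binomial(n, p), mean n*p = 6
first_success = []    # should follow Geometric(p), mean 1/p ~ 3.33

for _ in range(runs):
    trials = [rng.random() < p for _ in range(n)]
    successes_in_n.append(sum(trials))
    t = 1
    while rng.random() >= p:   # count trials until the first success
        t += 1
    first_success.append(t)

print(sum(successes_in_n) / runs)  # ~ 6.0
print(sum(first_success) / runs)   # ~ 3.33
```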

Formal definition

The Bernoulli process can be formalized in the language of probability spaces as a random sequence of independent realisations of a random variable that can take values of heads or tails. The state space for an individual value is denoted by 2 = {H, T}.

Specifically, one considers the countably infinite direct product of copies of 2 = {H, T}. It is common to examine either the one-sided set Ω = 2^ℕ or the two-sided set Ω = 2^ℤ. There is a natural topology on this space, called the product topology. The sets in this topology are finite sequences of coin flips, that is, finite-length strings of H and T, with the rest of the (infinitely long) sequence taken as "don't care". These sets of finite sequences are referred to as cylinder sets in the product topology. The set of all such strings forms a sigma algebra, specifically, a Borel algebra. This algebra is then commonly written as (Ω, B), where the elements of B are the finite-length sequences of coin flips (the cylinder sets).

If the chances of flipping heads or tails are given by the probabilities {p, 1 − p}, then one can define a natural measure on the product space, given by P = {p, 1 − p}^ℕ (or by P = {p, 1 − p}^ℤ for the two-sided process). Given a cylinder set, that is, a specific sequence of coin flip results [ω_1, ω_2, ..., ω_n] at times 1, 2, ..., n, the probability of observing this particular sequence is given by

    P([ω_1, ω_2, ..., ω_n]) = p^k (1 − p)^(n−k),


where k is the number of times that H appears in the sequence, and n − k is the number of times that T appears in the sequence. There are several different kinds of notations for the above; a common one is to write

    P(X_1 = x_1, X_2 = x_2, ..., X_n = x_n) = p^k (1 − p)^(n−k),

where each X_i is a binary-valued random variable. It is common to write x_i for ω_i. This probability P is commonly called the Bernoulli measure.[1]

Note that the probability of any specific, infinitely long sequence of coin flips is exactly zero; this is because p^k (1 − p)^(n−k) → 0 as n → ∞, for any 0 < p < 1. One says that any given infinite sequence has measure zero. Nevertheless, one can still say that some classes of infinite sequences of coin flips are far more likely than others; this is given by the asymptotic equipartition property.

To conclude the formal definition, a Bernoulli process is then given by the probability triple (Ω, B, P), as defined above.

BINOMIAL DISTRIBUTION :

The law of large numbers states that, on average, the expectation value of flipping heads for any one coin flip is p. That is, one writes

    E[X_i] = p

for any one given random variable X_i out of the infinite sequence of Bernoulli trials that compose the Bernoulli process.

One is often interested in knowing how often one will observe H in a sequence of n coin flips. This is given by simply counting: given n successive coin flips, that is, given the set of all possible strings of length n, the number N(k, n) of such strings that contain k occurrences of H is given by the binomial coefficient

    N(k, n) = C(n, k) = n! / (k! (n − k)!).

If the probability of flipping heads is given by p, then the total probability of seeing a string of length n with k heads is

    P(k, n) = C(n, k) p^k (1 − p)^(n−k).

This probability is known as the binomial distribution.
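A direct computation sketch (ours) of the probability P(k, n) above, cross-checked against simulated coin flips:

```python
import math
import random

def binom_pmf(k, n, p):
    """P(k, n) = C(n, k) * p^k * (1 - p)^(n - k)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

n, p, k = 10, 0.5, 4
exact = binom_pmf(k, n, p)

rng = random.Random(0)
runs = 100_000
hits = sum(sum(rng.random() < p for _ in range(n)) == k for _ in range(runs))

print(exact, hits / runs)   # both ~ 0.205
```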


Of particular interest is the question of the value of P(k, n) for very long sequences of coin flips, that is, for the limit n → ∞. In this case, one may make use of Stirling's approximation to the factorial, and write

    n! ≈ √(2πn) (n/e)^n.

Inserting this into the expression for P(k, n), one obtains the Gaussian distribution; this is the content of the central limit theorem, and this is the simplest example thereof.

The combination of the law of large numbers, together with the central limit theorem, leads to an interesting and perhaps surprising result: the asymptotic equipartition property. Put informally, one notes that, yes, over many coin flips, one will observe H exactly a p fraction of the time, and that this corresponds exactly with the peak of the Gaussian. The asymptotic equipartition property essentially states that this peak is infinitely sharp, with infinite fall-off on either side. That is, given the set of all possible infinitely long strings of H and T occurring in the Bernoulli process, this set is partitioned into two: those strings that occur with probability 1, and those that occur with probability 0. This partitioning is known as the Kolmogorov 0-1 law.

The size of this set is interesting, also, and can be explicitly determined: the logarithm of it is exactly the entropy of the Bernoulli process. Once again, consider the set of all strings of length n. The size of this set is 2^n. Of these, only a certain subset are likely; the size of this set is 2^(nH), for H ≤ 1. By using Stirling's approximation, putting it into the expression for P(k, n), solving for the location and width of the peak, and finally taking n → ∞, one finds that

    H = −p log₂ p − (1 − p) log₂ (1 − p).

This value is the Bernoulli entropy of a Bernoulli process. Here, H stands for entropy; do not confuse it with the same symbol H standing for heads.
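A small sketch (ours) of the binary entropy function just derived, together with the 2^(nH) count of typical strings for an assumed p = 0.3 and n = 100:

```python
import math

def binary_entropy(p):
    """H(p) = -p log2 p - (1-p) log2 (1-p), with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

p, n = 0.3, 100
H = binary_entropy(p)
print(H)                      # ~ 0.881 bits per flip
print(2 ** (n * H), 2 ** n)   # the typical set is far smaller than all 2^n strings
```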

Von Neumann posed a curious question about the Bernoulli process: is it ever possible that a given process is isomorphic to another, in the sense of the isomorphism of dynamical systems? The question long defied analysis, but was finally and completely answered with the Ornstein isomorphism theorem. This breakthrough resulted in the understanding that the Bernoulli process is unique and universal; in a certain sense, it is the single most random process possible; nothing is 'more' random than the Bernoulli process (although one must be careful with this informal statement; certainly, systems that are mixing are, in a certain sense, 'stronger' than the Bernoulli process, which is merely ergodic but not mixing. However, such processes do not consist of independent random variables: indeed, many purely deterministic, non-random systems can be mixing).


    POISSON PROCESS :

In probability theory, a Poisson process is a stochastic process that counts the number of events, and the times at which these events occur, in a given time interval. The time between each pair of consecutive events has an exponential distribution with parameter λ, and each of these inter-arrival times is assumed to be independent of the other inter-arrival times. The process is named after the French mathematician Siméon Denis Poisson and is a good model of radioactive decay,[1] telephone calls[2] and requests for a particular document on a web server,[3] among many other phenomena.

The Poisson process is a continuous-time process; the sum of a Bernoulli process can be thought of as its discrete-time counterpart. A Poisson process is a pure-birth process, the simplest example of a birth-death process. It is also a point process on the real half-line.

The basic form of Poisson process, often referred to simply as "the Poisson process", is a continuous-time counting process {N(t), t ≥ 0} that possesses the following properties:

N(0) = 0

Independent increments (the numbers of occurrences counted in disjoint intervals are independent of each other)

Stationary increments (the probability distribution of the number of occurrences counted in any time interval only depends on the length of the interval)

The probability distribution of N(t) is a Poisson distribution.

No counted occurrences are simultaneous.

Consequences of this definition include:

The probability distribution of the waiting time until the next occurrence is an exponential distribution.

The occurrences are distributed uniformly on any interval of time. (Note that N(t), the total number of occurrences, has a Poisson distribution over (0, t], whereas the location of an individual occurrence on t ∈ (a, b] is uniform.)
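As an illustration of ours (assuming rate λ = 2), a homogeneous Poisson process can be simulated by summing independent exponential inter-arrival times; the count N(t) then has mean λt:

```python
import random

def poisson_process(lam, t_max, seed=None):
    """Return event times of a homogeneous Poisson process with rate lam
    on (0, t_max], built from i.i.d. Exponential(lam) inter-arrival times."""
    rng = random.Random(seed)
    times, t = [], 0.0
    while True:
        t += rng.expovariate(lam)
        if t > t_max:
            return times
        times.append(t)

lam, t_max, runs = 2.0, 10.0, 10_000
counts = [len(poisson_process(lam, t_max, seed=i)) for i in range(runs)]
print(sum(counts) / runs)   # ~ lam * t_max = 20
```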

Other types of Poisson process are described below:

1. Homogeneous

2. Non-homogeneous


[Figure: Sample path of a counting Poisson process.]

Homogeneous :

The homogeneous Poisson process counts events that occur at a constant rate; it is one of the most well-known Lévy processes. This process is characterized by a rate parameter λ, also known as intensity, such that the number of events in a time interval (t, t + τ] follows a Poisson distribution with associated parameter λτ. This relation is given as

    P[N(t + τ) − N(t) = k] = e^(−λτ) (λτ)^k / k!,   k = 0, 1, 2, ...,

where N(t + τ) − N(t) = k is the number of events in the time interval (t, t + τ].

Just as a Poisson random variable is characterized by its scalar parameter λ, a homogeneous Poisson process is characterized by its rate parameter λ, which is the expected number of "events" or "arrivals" that occur per unit time. (In the figure above, N(t) is a sample homogeneous Poisson process, not to be confused with a density or distribution function.)

Non-homogeneous :

A non-homogeneous Poisson process counts events that occur at a variable rate. In general, the rate parameter may change over time; such a process is called a non-homogeneous Poisson process or inhomogeneous Poisson process. In this case, the generalized rate function is given as λ(t). Now the expected number of events between time a and time b is

    λ_{a,b} = ∫_a^b λ(t) dt.

Thus, the number of arrivals in the time interval (a, b], given as N(b) − N(a), follows a Poisson distribution with associated parameter λ_{a,b}.


A rate function λ(t) in a non-homogeneous Poisson process can be either a deterministic function of time or an independent stochastic process, giving rise to a Cox process. A homogeneous Poisson process may be viewed as the special case with λ(t) = λ, a constant rate.
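A common way to simulate a non-homogeneous Poisson process is thinning (the Lewis-Shedler method, not described in these notes): generate a homogeneous process at a dominating rate λ_max and keep an event at time t with probability λ(t)/λ_max. A minimal sketch, with λ(t) = 1 + sin t as an assumed example rate:

```python
import math
import random

def thinned_poisson(rate_fn, rate_max, t_max, seed=None):
    """Inhomogeneous Poisson process via thinning: requires
    rate_fn(t) <= rate_max for all t in (0, t_max]."""
    rng = random.Random(seed)
    times, t = [], 0.0
    while True:
        t += rng.expovariate(rate_max)   # candidate from the homogeneous process
        if t > t_max:
            return times
        if rng.random() < rate_fn(t) / rate_max:
            times.append(t)              # accept with probability rate_fn(t)/rate_max

events = thinned_poisson(lambda t: 1 + math.sin(t), rate_max=2.0,
                         t_max=100.0, seed=0)
# Expected count is the integral of 1 + sin t over (0, 100], about 100.
print(len(events))
```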

    RENEWAL PROCESS :

Renewal theory is the branch of probability theory that generalizes Poisson processes for arbitrary holding times. Applications include calculating the expected time for a monkey who is randomly tapping at a keyboard to type the word "Macbeth" and comparing the long-term benefits of different insurance policies.

A renewal process is a generalization of the Poisson process. In essence, the Poisson process is a continuous-time Markov process on the positive integers (usually starting at zero) which has independent identically distributed holding times at each integer (exponentially distributed) before advancing (with probability 1) to the next integer. In the same informal spirit, we may define a renewal process to be the same thing, except that the holding times take on a more general distribution. (Note, however, that the independence and identical distribution (IID) property of the holding times is retained.)

Let S_1, S_2, S_3, ... be a sequence of positive independent identically distributed random variables such that

    0 < E[S_i] < ∞.

We refer to the random variable S_i as the "i-th" holding time.

Define for each n > 0:

    J_n = S_1 + S_2 + ... + S_n,

each J_n referred to as the "n-th" jump time, and the intervals [J_{n−1}, J_n] being called renewal intervals.

Then the random variable (X_t)_{t ≥ 0} given by

    X_t = Σ_{n ≥ 1} 1{J_n ≤ t}

(where 1 is the indicator function) represents the number of jumps that have occurred by time t, and is called a renewal process.
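A simulation sketch (ours, using Uniform(0, 2) holding times as an arbitrary non-exponential choice) of a renewal process, illustrating that the long-run renewal rate E[X_t]/t approaches 1/E[S]:

```python
import random

def renewal_count(t_max, holding_time, seed=None):
    """Count renewals by time t_max: X_t = number of jump times J_n <= t_max."""
    rng = random.Random(seed)
    t, count = 0.0, 0
    while True:
        t += holding_time(rng)       # next i.i.d. holding time S_n
        if t > t_max:
            return count
        count += 1

# Holding times Uniform(0, 2): E[S] = 1, so X_t / t should approach 1.
t_max, runs = 1000.0, 200
counts = [renewal_count(t_max, lambda r: r.uniform(0, 2), seed=i)
          for i in range(runs)]
print(sum(counts) / runs / t_max)   # ~ 1.0
```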


[Figure: Sample evolution of a renewal process with holding times S_i and jump times J_n.]

The renewal equation

The renewal function m(t) = E[X_t] satisfies

    m(t) = F_S(t) + ∫_0^t m(t − s) f_S(s) ds,

where F_S is the cumulative distribution function of S_1 and f_S is the corresponding probability density function.

Proof of the renewal equation :

We may iterate the expectation about the first holding time:

    m(t) = E[X_t] = E[ E[X_t | S_1] ].

But by the Markov property,

    E[X_t | S_1 = s] = 1{t ≥ s} (1 + E[X_{t−s}]).

So

    m(t) = ∫_0^t (1 + m(t − s)) f_S(s) ds = F_S(t) + ∫_0^t m(t − s) f_S(s) ds,

as required.

    RANDOM INCIDENCE :

The Poisson process is one of many stochastic processes that one encounters in urban service systems. The Poisson process is one example of a "point process" in which discrete events (arrivals) occur at particular points in time. For a general point process having its zeroth arrival at time T_0 and the remaining arrivals at times T_1, T_2, T_3, ..., the interarrival times are

    Y_k = T_k − T_{k−1},   k = 1, 2, 3, ....

Such a stochastic process is fully characterized by the family of joint pdf's

    f_{Y_{n1}, Y_{n2}, ..., Y_{np}}(y_{n1}, y_{n2}, ..., y_{np})

for all integer values of p and all possible combinations of different n_1, n_2, ..., where each n_i is a positive integer denoting a particular interarrival time. Maintaining the depiction of a stochastic process at such a general level, although fine in theory, yields an intractable model and one for which the data (to estimate all the joint pdf's) are virtually impossible to obtain. So, in the study of stochastic processes, one is motivated to make assumptions about this family of pdf's that


(1) are realistic for an important class of problems, and (2) yield a tractable model.

We wish to consider here the class of point stochastic processes for which the marginal pdf's for all of the interarrival times Y_k are identical. That is, we assume that

    f_{Y_k}(·) = f_Y(·)   for all k.

Thus, for Y_k, if we selected any one of the family of joint pdf's f_{Y_{n1}, Y_{n2}, ..., Y_{np}}(y_{n1}, y_{n2}, ..., y_k, ..., y_{np}) and "integrated out" all variables except y_k, we would obtain f_Y(·). Note that we have said nothing about independence of the Y_k's.

They need not be mutually independent, pairwise independent, or conditionally independent in any way. For the special case in which the Y_k's are mutually independent, the point process is called a renewal process. The Poisson process is a special case of a renewal process, being the only continuous-time renewal process having "no memory." However, the kind of process we are considering can exhibit both memory and dependence among the inter-event times. In fact, the dependence could be so strong that once we know the value of one of the Y_k's we might know a great deal (perhaps even the exact values) of any number of the remaining Y_k's.

Example :

Consider a potential bus passenger arriving at a bus stop. The k-th bus arrives Y_k time units after the (k − 1)st bus. Here the Y_k's are called bus headways. The probabilistic behavior of the Y_k's will determine the probability law for the waiting time of the potential passenger (until the next bus arrives). Here it is reasonable to assume that the Y_k's are identically distributed but not independent (due to interactions between successive buses). One could estimate the pdf f_Y(·) simply by gathering data describing bus interarrival times and displaying the data in the form of a histogram. (This same model applies to subways and even elevators in a multielevator building.)

Suppose that buses maintain perfect headway; that is, they are always T_0 minutes apart. Then


    f_V(v) = 1/T_0,   0 ≤ v ≤ T_0.

That is, the time V until the next bus arrives, given random incidence, is uniformly distributed between 0 and T_0, with mean E[V] = T_0/2, as we might expect intuitively.
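Random incidence becomes more interesting when headways vary: a randomly arriving passenger is more likely to land inside a long headway, so in general E[V] = E[Y²]/(2 E[Y]), which exceeds E[Y]/2. A simulation sketch of ours, with an assumed two-point headway distribution:

```python
import random

rng = random.Random(0)

# Hypothetical headways Y: 5 minutes or 15 minutes, equally likely.
# E[Y] = 10, E[Y^2] = (25 + 225) / 2 = 125, so E[V] = 125 / 20 = 6.25,
# larger than E[Y]/2 = 5: random incidence favours long headways.
def random_incidence_wait(r):
    # A uniformly random arrival instant lands in a given headway with
    # probability proportional to its length (length-biased sampling):
    # P(5-minute headway) = (0.5 * 5) / E[Y] = 0.25.
    y = 5.0 if r.random() < 0.25 else 15.0
    # Within that headway the arrival instant is uniform.
    return r.uniform(0, y)

waits = [random_incidence_wait(rng) for _ in range(200_000)]
print(sum(waits) / len(waits))   # ~ 6.25
```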

    MARKOV MODULATED BERNOULLI PROCESS :

The Markov-Modulated Bernoulli Process (MMBP) model is used to analyze the delay experienced by messages in clocked, packet-switched Banyan networks with k x k output-buffered switches. This approach allows us to analyze both single-packet messages and multipacket messages with general traffic patterns, including uniform traffic, hot-spot traffic, locality of reference, etc. The ability to analyze multipacket messages is very important for multimedia applications. Previous work, which is only applicable to restricted message and traffic patterns, resorts to either heuristic correction factors to artificially tune the model or tedious computational efforts. In contrast, the proposed model, which is applicable to much more general message and traffic patterns, not only is an application of a theoretically complete model but also requires a minimal amount of computational effort. In all cases, the analytical results are compared with results obtained by simulation and are shown to be very accurate.


    DTMC - Discrete Time Markov Chains

    IRREDUCIBLE FINITE CHAINS WITH APERIODIC STATES :

A Markov chain (discrete-time Markov chain or DTMC), named after Andrey Markov, is a mathematical system that undergoes transitions from one state to another on a state space. It is a random process usually characterized as memoryless: the next state depends only on the current state and not on the sequence of events that preceded it. This specific kind of "memorylessness" is called the Markov property. Markov chains have many applications as statistical models of real-world processes.
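A small sketch (ours, with a hypothetical two-state weather chain) of a DTMC: simulate transitions and compare the long-run fraction of time spent in each state against the stationary distribution π solving π = πP.

```python
import random

# Hypothetical two-state chain: states 0 = "sunny", 1 = "rainy".
# Row-stochastic transition matrix: P[i][j] = P(next = j | current = i).
P = [[0.9, 0.1],
     [0.5, 0.5]]

def simulate(P, steps, seed=None):
    rng = random.Random(seed)
    state, visits = 0, [0, 0]
    for _ in range(steps):
        visits[state] += 1
        state = 0 if rng.random() < P[state][0] else 1
    return [v / steps for v in visits]

# For this chain, pi = (5/6, 1/6) solves pi = pi P; the chain is
# irreducible and aperiodic, so long-run frequencies converge to pi.
print(simulate(P, 500_000, seed=0))   # ~ [0.833, 0.167]
```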


    DISCRETE TIME BIRTH DEATH PROCESS :

The birth-death process is a special case of a continuous-time Markov process where the state transitions are of only two types: "births", which increase the state variable by one, and "deaths", which decrease the state by one. The model's name comes from a common application, the use of such models to represent the current size of a population where the transitions are literal births and deaths. Birth-death processes have many applications in demography, queueing theory, performance engineering, epidemiology, and biology. They may be used, for example, to study the evolution

of bacteria, the number of people with a disease within a population, or the number of customers in line at the supermarket.

When a birth occurs, the process goes from state n to n + 1. When a death occurs, the process goes from state n to state n − 1. The process is specified by birth rates {λ_i}, i = 0, 1, ..., and death rates {μ_i}, i = 1, 2, ....

Example :

A pure birth process is a birth-death process where μ_i = 0 for all i.

A pure death process is a birth-death process where λ_i = 0 for all i.

A (homogeneous) Poisson process is a pure birth process where λ_i = λ for all i.

M/M/1 model and M/M/c model, both used in queueing theory, are birth-death processes used to describe customers in an infinite queue.

Use in queueing theory :

In queueing theory the birth-death process is the most fundamental example of a queueing model, the M/M/C/K/∞/FIFO (in complete Kendall's notation) queue. This is a queue with Poisson arrivals, drawn from an infinite population, and C servers with exponentially distributed service time, with K places in the queue. Despite the assumption of an infinite population this model is a good model for various telecommunication systems.

M/M/1 queue

The M/M/1 is a single-server queue with an infinite buffer size. In a non-random environment the birth-death process in queueing models tends to be long-term averages, so the average rate of arrival is given as λ and the average service time as 1/μ. The birth and death process is an M/M/1 queue when

    λ_i = λ for all i ≥ 0,   μ_i = μ for all i ≥ 1.

The difference equations for the probability that the system is in state k at time t are

    dP_0(t)/dt = μ P_1(t) − λ P_0(t),
    dP_k(t)/dt = λ P_{k−1}(t) + μ P_{k+1}(t) − (λ + μ) P_k(t),   k ≥ 1.
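When λ < μ these equations have the stationary solution π_k = (1 − ρ) ρ^k with ρ = λ/μ; a short verification sketch (ours) that this geometric distribution balances the birth and death rates:

```python
# Stationary distribution of the M/M/1 queue: pi_k = (1 - rho) * rho**k,
# with utilization rho = lam / mu < 1 (here an assumed lam = 2, mu = 3).
lam, mu = 2.0, 3.0
rho = lam / mu
pi = [(1 - rho) * rho**k for k in range(200)]

# Detailed balance across each boundary k <-> k+1: lam*pi_k = mu*pi_{k+1}.
for k in range(10):
    assert abs(lam * pi[k] - mu * pi[k + 1]) < 1e-12

# Mean number in system: rho / (1 - rho) = 2 for these rates.
print(sum(k * p for k, p in enumerate(pi)))
```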


    M/M/c queue

The M/M/c is a multi-server queue with C servers and an infinite buffer. This differs from the M/M/1 queue only in the service rate, which now becomes

μ_k = kμ for k ≤ C

and

μ_k = Cμ for k > C,

with λ_k = λ for all k.
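
In code, the state-dependent death rate is simply min(k, C)·μ; a tiny sketch (the function name and parameters are illustrative):

def mmc_rates(k, lam, mu, c):
    # Birth and death rates of an M/M/c queue in state k: arrivals are
    # state-independent, service proceeds on min(k, c) busy servers.
    return lam, min(k, c) * mu

print(mmc_rates(5, 2.5, 1.0, 3))   # -> (2.5, 3.0): all 3 servers busy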

    M/M/1/K queue

The M/M/1/K queue is a single server queue with a buffer of size K. This queue has applications in telecommunications, as well as in biology when a population has a capacity limit. In telecommunication we again use the parameters from the M/M/1 queue with

λ_n = λ for 0 ≤ n < K, λ_n = 0 for n ≥ K, and μ_n = μ.

In biology, particularly the growth of bacteria, when the population is zero there is no ability to grow, so λ_0 = 0. Additionally, if the capacity represents a limit where the population dies from overpopulation, the death rate at capacity K is increased accordingly.

The differential equations for the probability that the system is in state k at time t are

p_0′(t) = μ p_1(t) − λ p_0(t),
p_k′(t) = λ p_{k−1}(t) + μ p_{k+1}(t) − (λ + μ) p_k(t) for 1 ≤ k ≤ K − 1,
p_K′(t) = λ p_{K−1}(t) − μ p_K(t).
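
Because the state space here is finite (K + 1 states), the equations can be solved exactly with a matrix exponential. A minimal sketch, with illustrative rates and K (the construction of Q below follows the telecommunication parameterisation):

import numpy as np
from scipy.linalg import expm

LAM, MU, K = 0.8, 1.0, 10   # illustrative parameters

# Transition rate matrix: births up to K, deaths down to 0, and a
# diagonal chosen so that every row sums to zero.
Q = np.zeros((K + 1, K + 1))
for n in range(K):
    Q[n, n + 1] = LAM
for n in range(1, K + 1):
    Q[n, n - 1] = MU
np.fill_diagonal(Q, -Q.sum(axis=1))

p0 = np.zeros(K + 1)
p0[0] = 1.0                 # start with an empty system
print((p0 @ expm(Q * 5.0)).round(4))   # state distribution at time t = 5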

    MARKOV PROPERTY :

In probability theory and statistics, the term Markov property refers to the memoryless property of a stochastic process. It is named after the Russian mathematician Andrey Markov.[1]

A stochastic process has the Markov property if the conditional probability distribution of future states of the process (conditional on both past and present values) depends only upon the present state, not on the sequence of events that preceded it. A process with this property is called a Markov process.

The term strong Markov property is similar to the Markov property, except that the meaning of "present" is defined in terms of a random variable known as a stopping time. Both the terms "Markov property" and "strong Markov property" have been used in connection with a particular "memoryless" property of the exponential distribution.[2]

The term Markov assumption is used to describe a model where the Markov property is assumed to hold, such as a hidden Markov model.

A Markov random field[3] extends this property to two or more dimensions, or to random variables defined for an interconnected network of items. An example of a model for such a field is the Ising model.

A discrete-time stochastic process satisfying the Markov property is known as a Markov chain.
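
A short sketch makes the property concrete: the next state is sampled from a distribution that depends only on the current state. The two-state "weather" transition probabilities below are illustrative assumptions.

import random

# Hypothetical chain: P[i][j] = probability of moving from state i to state j
P = {"sunny": {"sunny": 0.9, "rainy": 0.1},
     "rainy": {"sunny": 0.5, "rainy": 0.5}}

def step(state):
    # Sample the next state given only the current one (the Markov property).
    r, acc = random.random(), 0.0
    for nxt, prob in P[state].items():
        acc += prob
        if r < acc:
            return nxt
    return nxt              # guard against floating-point rounding

state, chain = "sunny", []
for _ in range(10):
    state = step(state)
    chain.append(state)
print(chain)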

FINITE MARKOV CHAIN :

A finite Markov chain is a Markov chain whose state space is finite. Its behaviour is completely specified by an initial distribution together with a transition probability matrix P, where P_ij = P(X_{n+1} = j | X_n = i) and each row of P sums to one; the distribution after n steps is obtained by multiplying the initial distribution by the nth power of P.
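
A minimal sketch of this computation (the two-state matrix and horizon are illustrative assumptions):

import numpy as np

P = np.array([[0.9, 0.1],        # illustrative transition matrix;
              [0.5, 0.5]])       # each row sums to one
init = np.array([1.0, 0.0])      # start in state 0 with certainty

print(init @ np.linalg.matrix_power(P, 10))   # distribution after 10 steps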


    CONTINUOUS-TIME MARKOV CHAIN :

In probability theory, a continuous-time Markov chain (CTMC[1] or continuous-time Markov process[2]) is a mathematical model which takes values in some finite or countable set and for which the time spent in each state takes non-negative real values and has an exponential distribution. It is a continuous-time stochastic process with the Markov property, which means that the future behaviour of the model (both the remaining time in the current state and the next state) depends only on the current state of the model and not on historical behaviour. The model is a continuous-time version of the Markov chain model, named because the output from such a process is a sequence (or chain) of states.


A continuous-time Markov chain (X_t)_{t ≥ 0} is defined by a finite or countable state space S, a transition rate matrix Q with dimensions equal to that of the state space, and an initial probability distribution defined on the state space. For i ≠ j, the elements q_ij are non-negative and describe the rate at which the process transitions from state i to state j. The elements q_ii are chosen such that each row of the transition rate matrix sums to zero.

    There are three equivalent definitions of the process.[3]

    Infinitesimal definition

Let X_t be the random variable describing the state of the process at time t, and assume that the process is in a state i at time t. Then X_{t+h} is independent of previous values (X_s : s ≤ t), and as h → 0, uniformly in t, for all j,

P(X_{t+h} = j | X_t = i) = δ_ij + q_ij h + o(h),

using little-o notation, where δ_ij is the Kronecker delta. The q_ij can be seen as measuring how quickly the transition from i to j happens.

    Jump chain/holding time definition

Define a discrete-time Markov chain Y_n to describe the nth jump of the process and variables S_1, S_2, S_3, ... to describe holding times in each of the states, where S_i follows the exponential distribution with rate parameter −q_{Y_i Y_i}; a simulation sketch follows below.
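
A minimal sketch of this construction, assuming a small illustrative rate matrix Q with no absorbing states: hold in state i for an exponential time with rate −q_ii, then jump to j ≠ i with probability q_ij/(−q_ii).

import random

# Illustrative 3-state transition rate matrix (each row sums to zero)
Q = [[-2.0,  1.5,  0.5],
     [ 1.0, -3.0,  2.0],
     [ 0.5,  0.5, -1.0]]

def simulate_ctmc(Q, i0, t_max):
    # Jump chain / holding time simulation of a CTMC (a sketch).
    t, i = 0.0, i0
    path = [(t, i)]
    while True:
        exit_rate = -Q[i][i]                 # total rate of leaving state i
        t += random.expovariate(exit_rate)   # exponential holding time
        if t >= t_max:
            break
        r, acc = random.uniform(0.0, exit_rate), 0.0
        for j, q in enumerate(Q[i]):         # choose the next state j != i
            if j != i:
                acc += q
                if r < acc:
                    i = j
                    break
        path.append((t, i))
    return path

print(simulate_ctmc(Q, 0, 5.0))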

Transition probability definition

For any value n = 0, 1, 2, 3, ... and times indexed up to this value of n: t_0, t_1, t_2, ... and all states recorded at these times i_0, i_1, i_2, i_3, ..., it holds that

P(X_{t_n} = i_n | X_{t_{n−1}} = i_{n−1}, ..., X_{t_0} = i_0) = p_{i_{n−1} i_n}(t_n − t_{n−1}),

where p_ij is the solution of the forward equation (a first-order differential equation)

P′(t) = P(t) Q


with initial condition P(0) = I, the identity matrix.
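
For a finite state space the forward equation can be checked numerically, since its solution is the matrix exponential P(t) = exp(tQ). A rough sketch with an illustrative Q:

import numpy as np
from scipy.linalg import expm

Q = np.array([[-2.0,  1.5,  0.5],
              [ 1.0, -3.0,  2.0],
              [ 0.5,  0.5, -1.0]])

t, h = 1.0, 1e-6
P = expm(Q * t)                            # P(t) = exp(tQ) solves P'(t) = P(t) Q
dP = (expm(Q * (t + h)) - P) / h           # finite-difference estimate of P'(t)
print(np.allclose(dP, P @ Q, atol=1e-4))   # -> True
print(P.sum(axis=1))                       # each row of P(t) sums to one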

    HIDDEN MARKOV MODEL :

A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process with unobserved (hidden) states. An HMM can be considered the simplest dynamic Bayesian network. The mathematics behind the HMM was developed by L. E. Baum and coworkers.[1][2][3][4][5] It is closely related to earlier work on the optimal nonlinear filtering problem (stochastic processes) by Ruslan L. Stratonovich,[6] who was the first to describe the forward-backward procedure.

In simpler Markov models (like a Markov chain), the state is directly visible to the observer, and therefore the state transition probabilities are the only parameters. In a hidden Markov model, the state is not directly visible, but output, dependent on the state, is visible. Each state has a probability distribution over the possible output tokens. Therefore, the sequence of tokens generated by an HMM gives some information about the sequence of states. Note that the adjective 'hidden' refers to the state sequence through which the model passes, not to the parameters of the model; the model is still referred to as a 'hidden' Markov model even if these parameters are known exactly.
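
As a small illustration, here is a sketch of the forward algorithm for such a model, computing the probability of an observed token sequence by summing over all hidden state sequences; the states, tokens, and probabilities below are illustrative assumptions.

# Hypothetical HMM: hidden weather states emit observable activities
states = ("rainy", "sunny")
start = {"rainy": 0.6, "sunny": 0.4}
trans = {"rainy": {"rainy": 0.7, "sunny": 0.3},
         "sunny": {"rainy": 0.4, "sunny": 0.6}}
emit = {"rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
        "sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}

def forward_likelihood(observations):
    # alpha[s] = P(observations so far, current hidden state = s)
    alpha = {s: start[s] * emit[s][observations[0]] for s in states}
    for o in observations[1:]:
        alpha = {s: emit[s][o] * sum(alpha[r] * trans[r][s] for r in states)
                 for s in states}
    return sum(alpha.values())              # marginalise the final state

print(forward_likelihood(["walk", "shop", "clean"]))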


Hidden Markov models are especially known for their application in temporal pattern recognition such as speech, handwriting, and gesture recognition,[7] part-of-speech tagging, musical score following,[8] partial discharges,[9] and bioinformatics.

A hidden Markov model can be considered a generalization of a mixture model where the hidden variables (or latent variables), which control the mixture component to be selected for each observation, are related through a Markov process rather than being independent of each other. Recently, hidden Markov models have been generalized to pairwise Markov models and triplet Markov models, which allow more complex data structures to be considered[10][11] and nonstationary data to be modeled.
