An intuitive introduction to information theory
Transcript of An intuitive introduction to information theory
An intuitive introduction to information theory
Ivo Grosse
Leibniz Institute of Plant Genetics and Crop Plant Research Gatersleben
Bioinformatics Centre Gatersleben-Halle
Outline
Why information theory?
An intuitive introduction
History of biology
St. Thomas Monastery, Brno
Genetics
Gregor Mendel, 1822 – 1884
1866: Mendel's laws
Foundation of genetics
Ca. 1900: Biology becomes a quantitative science
50 years later … 1953
James Watson & Francis Crick
50 years later … 1953
DNA
Watson & Crick, 1953
Double helix structure of DNA
1953: Biology becomes a molecular science
1953 – 2003 … 50 years of revolutionary discoveries
1989
1989
Goals:
Identify all of the ca. 30,000 genes
Identify all of the ca. 3,000,000,000 base pairs
Store all information in databases
Develop new software for data analysis
2003: Human Genome Project officially finished
2003: Biology becomes an information science
2003 – 2053 … biology = information science
2003 – 2053 … biology = information science
Systems Biology
What is information?
Many intuitive definitions
Most of them wrong
One clean definition since 1948
Requires 3 steps:
- Entropy
- Conditional entropy
- Mutual information
Before starting with entropy …
Who is the father of information theory?
Who is this?
Claude Shannon, 1916 – 2001
A Mathematical Theory of Communication. Bell System Technical Journal, 27, 379–423 & 623–656, 1948
Before starting with entropy …
Who is the grandfather of information theory?
Simon bar Kochba, ca. 100 – 135
Jewish guerrilla fighter against the Roman Empire (132 – 135)
Entropy
Given a text composed from an alphabet of 32 letters (each letter equally probable)
Person A chooses a letter X (randomly)
Person B wants to know this letter
B may ask only binary questions
Question: how many binary questions must B ask in order to learn which letter X was chosen by A?
Answer: entropy H(X)
Here: H(X) = 5 bits
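The five questions are just a binary search over the 32 letters, and the count matches Shannon's formula. A minimal sketch (not from the slides) checking that H(X) = log2(32) = 5 bits for a uniform 32-letter alphabet:

```python
import math

# Uniform alphabet of 32 letters: each letter has probability 1/32.
p = 1 / 32
H = -sum(p * math.log2(p) for _ in range(32))  # Shannon entropy, in bits

print(H)  # 5.0 — B needs 5 yes/no questions, a binary search over 32 letters
```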
Conditional entropy (1)
The sky is blu_
How many binary questions? 5?
No! Why? What’s wrong?
The context tells us “something” about the missing letter X
Conditional entropy (2)
Given a text composed from an alphabet of 32 letters (each letter equally probable)
Person A chooses a letter X (randomly)
Person B wants to know this letter
B may ask only binary questions
A may tell B the letter Y preceding X
E.g. L_ Q_
Question: how many binary questions must B ask in order to learn which letter X was chosen by A?
Answer: conditional entropy H(X|Y)
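For a concrete calculation, H(X|Y) can be computed from a joint distribution p(x, y). A sketch with a made-up two-letter toy table (not the 32-letter alphabet of the slide): Y = "q" determines X completely, while Y = "r" leaves X uniform over two letters.

```python
import math

def conditional_entropy(joint):
    """H(X|Y) = -sum over (x, y) of p(x,y) * log2 p(x|y), with p(x|y) = p(x,y)/p(y)."""
    p_y = {}
    for (x, y), p in joint.items():
        p_y[y] = p_y.get(y, 0.0) + p  # marginal p(y)
    return -sum(p * math.log2(p / p_y[y]) for (x, y), p in joint.items() if p > 0)

# Toy joint distribution: "q" fixes X = "a"; "r" leaves X uniform over {"a", "b"}
joint = {("a", "q"): 0.5, ("b", "r"): 0.25, ("a", "r"): 0.25}
print(conditional_entropy(joint))  # 0.5 bits: 0 bits given "q", 1 bit given "r", each half the time
```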
Conditional entropy (3)
H(X|Y) <= H(X)
Clear!
In the worst case, namely if B ignores all “information” in Y about X, B needs H(X) binary questions
Under no circumstances should B need more than H(X) binary questions
Knowledge of Y cannot increase the number of binary questions
Knowledge can never harm! (a mathematical statement, perhaps not true in real life)
Mutual information (1)
Compare two situations:
I: learn X without knowing Y
II: learn X while knowing Y
How many binary questions in case I? H(X)
How many binary questions in case II? H(X|Y)
Question: how many binary questions can B save by knowing Y?
Answer: I(X;Y) = H(X) – H(X|Y)
I(X;Y) = information in Y about X
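The saving I(X;Y) = H(X) − H(X|Y) can be computed directly from a joint table. A sketch with a toy distribution (not from the slides) in which Y is a perfect copy of X, so knowing Y saves all H(X) questions:

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a probability list."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def mutual_information(joint):
    """I(X;Y) = H(X) - H(X|Y), from a joint table {(x, y): p(x, y)}."""
    p_x, p_y = {}, {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0.0) + p  # marginal p(x)
        p_y[y] = p_y.get(y, 0.0) + p  # marginal p(y)
    h_x = entropy(p_x.values())
    h_x_given_y = -sum(p * math.log2(p / p_y[y]) for (x, y), p in joint.items() if p > 0)
    return h_x - h_x_given_y

# Y is a perfect copy of X: knowing Y saves all H(X) = 1 bit of questions
print(mutual_information({(0, 0): 0.5, (1, 1): 0.5}))  # 1.0
```

With an independent joint (all four (x, y) pairs equally probable) the same function returns 0.0: knowing Y saves nothing.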
Mutual information (2)
H(X|Y) <= H(X)
I(X;Y) >= 0
In the worst case, namely if B ignores all information in Y about X or if there is no information in Y about X, then I(X;Y) = 0
Information in Y about X can never be negative
Knowledge can never harm! (a mathematical statement, perhaps not true in real life)
Mutual information (3)
Example 1: random sequence composed of A, C, G, T (equally probable)
I(X;Y) = ?
H(X) = 2 bits
H(X|Y) = 2 bits
I(X;Y) = H(X) – H(X|Y) = 0 bits
Example 2: deterministic sequence … ACGT ACGT ACGT ACGT …
I(X;Y) = ?
H(X) = 2 bits
H(X|Y) = 0 bits
I(X;Y) = H(X) – H(X|Y) = 2 bits
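Both examples can be checked empirically by estimating the distributions from letter-pair counts. A sketch, taking X as a letter and Y as its predecessor; the estimate on a random sequence is only approximately 0 bits because of finite-sample bias:

```python
import math
import random
from collections import Counter

def mi_prev_letter(seq):
    """Plug-in estimate of I(X;Y) between a letter X and the preceding letter Y."""
    pairs = list(zip(seq, seq[1:]))  # (y, x): predecessor and current letter
    n = len(pairs)
    c_xy = Counter(pairs)
    c_y = Counter(y for y, _ in pairs)
    c_x = Counter(x for _, x in pairs)
    # I(X;Y) = sum p(x,y) * log2( p(x,y) / (p(x) p(y)) ), with empirical frequencies
    return sum((c / n) * math.log2(c * n / (c_y[y] * c_x[x]))
               for (y, x), c in c_xy.items())

print(mi_prev_letter("ACGT" * 1000))  # ~2 bits: the previous letter determines the next

random.seed(0)
rnd = "".join(random.choice("ACGT") for _ in range(100_000))
print(mi_prev_letter(rnd))  # ~0 bits (small positive sampling bias)
```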
Mutual information (4)
I(X;Y) = I(Y;X)
Always! For any X and any Y!
Information in Y about X = information in X about Y
Examples:
How much information is there in the amino acid sequence about the secondary structure?
How much information is there in the secondary structure about the amino acid sequence?
How much information is there in the expression profile about the function of the gene?
How much information is there in the function of the gene about the expression profile?
Mutual information
Summary
Entropy
Conditional entropy
Mutual information
There is no such thing as “information content”:
Information is not defined for a single variable
Two random variables are needed to talk about information
Information in Y about X
I(X;Y) = I(Y;X)
Info in Y about X = info in X about Y
I(X;Y) >= 0
Information is never negative
Knowledge cannot harm
I(X;Y) = 0 if and only if X and Y are statistically independent
I(X;Y) > 0 otherwise