Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward...
![Page 1: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/1.jpg)
Cmprssd Vw f Infrmtn Thry: A Compressed View of Information
John Woodward jrw@cs.stir.ac.uk
1. Is a picture really worth 1000 words?
2. Does the Complete Works of Shakespeare contain more information in its original language or a translation?
3. Why is tossing a coin the best way to make a decision?
4. What is your best defence when interrogated?
5. Why is the original scientific paper outlining information theory still relevant?
![Page 2: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/2.jpg)
Information Age
• probability / coding theory
• Transmit, share, copy, digest, delete, evaluate, interpret, value, ignore
1. Shannon entropy is concerned with the transmission of a message.
2. Algorithmic information theory is concerned with the information content of the message itself.
1948
![Page 3: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/3.jpg)
The diving bell and the butterfly
![Page 4: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/4.jpg)
The diving bell and the butterfly
ABCDEFGHIJKLMNOPQRSTUVWXYZ
Move your finger L -> R
A is 1 time unit
B is 2
C is 3
Z is 26
e.g. “BUT” 2 + 21 + 20 = 43 “SECONDS”
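The letter-by-letter costing above can be sketched in a few lines of Python. This is a minimal illustration, not something from the slides: the function name `spelling_cost` is hypothetical, and `FREQUENCY_ORDER` is an assumed, approximate ordering of English letters by frequency, included only to show why a reordered alphabet can be cheaper.

```python
import string

def spelling_cost(word, alphabet=string.ascii_uppercase):
    """Total time units to spell `word` when the letter at index i costs i + 1."""
    position = {letter: i + 1 for i, letter in enumerate(alphabet)}
    return sum(position[letter] for letter in word.upper())

# Assumption: an approximate English frequency ordering (not from the slides).
FREQUENCY_ORDER = "ETAOINSHRDLCUMWFGYPBVKJXQZ"

for word in ("BUT", "THE"):
    alphabetical = spelling_cost(word)
    by_frequency = spelling_cost(word, FREQUENCY_ORDER)
    print(word, "alphabetical:", alphabetical, "frequency order:", by_frequency)
```

With the plain alphabet, "BUT" costs 2 + 21 + 20 = 43 units, as on the slide. Putting common letters first tends to lower the cost for typical words.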
![Page 5: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/5.jpg)
The diving bell and the butterfly
ABCDEFGHIJKLMNOPQSTUVWXYZ
![Page 6: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/6.jpg)
Frequency of a Symbol
• Typewriter QWERTY
• Computer QWERTY???
![Page 7: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/7.jpg)
• Megabee HAWKING movie https://www.youtube.com/watch?v=BtMeI3xGtcM
![Page 8: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/8.jpg)
![Page 9: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/9.jpg)
Morse Code
• How many symbols are in the Morse code?
• 1, 2, 3, 4, 5
https://www.youtube.com/watch?v=Z5uyK5MrsTs
![Page 10: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/10.jpg)
Morse Code
• Contains 4 symbols.
• Morse did basic frequency (probabilistic) analysis.
• Within 15% of optimum.
https://www.youtube.com/watch?v=Z5uyK5MrsTs
![Page 11: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/11.jpg)
Morse code tree
• 3 gaps
• Most frequent letters
THESE SHOULD BE THE KEYS ON YOUR COMPUTER
![Page 12: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/12.jpg)
Morse code tree
1) ... --- ...
2) --- -- --.
![Page 13: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/13.jpg)
Morse code tree
1) ... --- ... SOS
2) --- -- --. OMG
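The two decoding exercises above can be checked with a small lookup table. The sketch below uses the standard International Morse letters A to Z; the helper names `encode` and `decode` are hypothetical, and it assumes letters are separated by single spaces, with no handling of word gaps.

```python
MORSE = {
    "A": ".-",   "B": "-...", "C": "-.-.", "D": "-..",  "E": ".",
    "F": "..-.", "G": "--.",  "H": "....", "I": "..",   "J": ".---",
    "K": "-.-",  "L": ".-..", "M": "--",   "N": "-.",   "O": "---",
    "P": ".--.", "Q": "--.-", "R": ".-.",  "S": "...",  "T": "-",
    "U": "..-",  "V": "...-", "W": ".--",  "X": "-..-", "Y": "-.--",
    "Z": "--..",
}
DECODE = {code: letter for letter, code in MORSE.items()}

def encode(text):
    """Encode letters A-Z as Morse, one space between letters."""
    return " ".join(MORSE[letter] for letter in text.upper())

def decode(message):
    """Decode Morse letters separated by whitespace."""
    return "".join(DECODE[symbol] for symbol in message.split())

print(decode("... --- ..."))  # SOS
print(decode("--- -- --."))   # OMG

# Frequent letters get short codes: compare E and T with Q and Z.
for letter in "ETQZ":
    print(letter, MORSE[letter], len(MORSE[letter]))
```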
![Page 14: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/14.jpg)
Coin Tossing
• A fair coin
• A double headed/tailed coin
• Gambler’s fallacy – each toss is independent.
– Symmetric,
– monotonic
![Page 15: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/15.jpg)
![Page 16: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/16.jpg)
Making a Decision
• If you cannot make a rational decision … toss a fair coin. MAXIMUM ENTROPY
• This has maximum “surprise” or least predictability.
• With a friend – chocolate cake or broccoli.
• http://en.wikipedia.org/wiki/The_Dice_Man – makes decisions by rolling a dice.
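The "maximum entropy" claim for a fair coin can be illustrated with the binary Shannon entropy, H(p) = -p log2 p - (1 - p) log2 (1 - p). The sketch below is only an illustration; the function name `binary_entropy` and the sample probabilities are assumptions, not from the slides.

```python
import math

def binary_entropy(p):
    """Shannon entropy in bits of a coin that lands heads with probability p."""
    if p in (0.0, 1.0):
        return 0.0  # a double-headed or double-tailed coin carries no surprise
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

for p in (0.0, 0.1, 0.5, 0.9, 1.0):
    print(f"p(heads) = {p:.1f}  entropy = {binary_entropy(p):.3f} bits")
```

The value peaks at exactly 1 bit for p = 0.5 and falls to 0 bits for a double-headed or double-tailed coin. It is also symmetric: a coin with p = 0.1 is as predictable as one with p = 0.9.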
![Page 17: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/17.jpg)
Police Interview
1. Police may ask you to repeat your statement. Why?
2. This is a tactic.
3. Or they may ask you for details in a different order.
4. “no comment interview”
5. MINIMUM ENTROPY
6. https://www.youtube.com/watch?v=q4f_vi7yKuU
7. What is the information content?
8. Neither confirm nor deny
![Page 18: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/18.jpg)
Linguistics.
1. zs, td, pb, fv ???
2. `` – how much info?
3. shorthand.
4. Lip reading.
![Page 19: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/19.jpg)
![Page 20: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/20.jpg)
Genetic Code 1
4 BASES A-U C-G
20 AMINO ACIDS
![Page 21: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/21.jpg)
Genetic Code 2
![Page 22: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/22.jpg)
Genetic Code 3
1. No gaps
2. Use code words of 3 bases (ATCG), not 2 or 4, for 21 code words (20 amino acids + stop)
3. Instantaneous – needed!
4. Even if a mistake is made in the last base – often okay – grouped (locality/redundancy)
5. Even if wrong amino acid – still has similar chemical properties.
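The choice of 3 bases per code word is a simple counting argument: with 4 bases, words of length 2 give only 4^2 = 16 combinations, too few for 21 meanings, while length 3 gives 4^3 = 64. A minimal sketch, with the hypothetical helper `min_codeword_length`:

```python
def min_codeword_length(alphabet_size, codewords_needed):
    """Smallest fixed length n such that alphabet_size ** n >= codewords_needed."""
    n = 0
    while alphabet_size ** n < codewords_needed:
        n += 1
    return n

BASES = 4    # A, T, C, G
NEEDED = 21  # 20 amino acids + stop

for length in (1, 2, 3, 4):
    print(f"length {length}: {BASES ** length} possible code words")

print("minimum length:", min_codeword_length(BASES, NEEDED))  # 3
```

The 64 - 21 = 43 spare code words are what allow several code words to map to the same amino acid, which is where the redundancy mentioned in point 4 comes from.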
![Page 23: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/23.jpg)
Half Time
• Transmitting information – Shannon entropy.
• Algorithmic complexity – the information in the message.
![Page 24: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/24.jpg)
Lossless/Lossy Compression
• https://www.youtube.com/watch?v=QEzhxP-pdos
![Page 25: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/25.jpg)
Compress A File
• Not all strings are compressible.
• We want to compress all bit strings of length 3, to be shorter.
• proof – pigeon hole principle.
• In fact most strings are not compressible – “RENAMING”
7 bit strings of length <= 2:
“” 0 1 00 01 10 11
8 bit strings of length = 3:
0 0 0   0
0 0 1   1
0 1 0   2
0 1 1   3
1 0 0   4
1 0 1   5
1 1 0   6
1 1 1   7
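The pigeonhole argument can be checked by enumeration: there are 8 strings of length 3 but only 7 strictly shorter strings (including the empty string), so no lossless scheme can shorten all of them. The helper names below are hypothetical.

```python
from itertools import product

def bit_strings(length):
    """All bit strings of exactly the given length, as Python strings."""
    return ["".join(bits) for bits in product("01", repeat=length)]

def count_shorter(n):
    """Number of bit strings with length strictly less than n."""
    return sum(len(bit_strings(k)) for k in range(n))

n = 3
sources = bit_strings(n)                                  # strings to compress
targets = [s for k in range(n) for s in bit_strings(k)]  # possible shorter outputs

print(len(sources), "strings of length", n)
print(len(targets), "strings of length <", n, ":", targets)
```

In general there are 2^n strings of length n and only 2^n - 1 shorter ones, so at least one input always fails to shrink.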
![Page 26: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/26.jpg)
Kolmogorov Complexity
“0000000000000…”
“0010010010010…”
“1011010010110…”
![Page 27: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/27.jpg)
Kolmogorov Complexity
“0000000000000…” repeat 60 times “0”
“0010010010010…” repeat 20 times “001”
“1011010010110…” print “1011010010110…”
![Page 28: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/28.jpg)
Kolmogorov Complexity
“0000000000000…” repeat 60 times “0”
“0010010010010…” repeat 20 times “001”
“1011010010110…” print “1011010010110…”
1. According to probability theory they are all equally likely?
2. If there is a pattern, we can write a rule and implement it on a computer.
3. Kolmogorov complexity of a bit string is the length of the shortest computer program to print the string and halt.
4. Can be thought of as a measure of compressibility.
5. Amazing fact – it is independent of the computer you run it on (up to an additive constant).
6. Which strings above have high/low Kolmogorov complexity???
7. Kolmogorov complexity is a generalization of Shannon entropy.
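Kolmogorov complexity itself cannot be computed, but the size of a compressed file gives a rough upper-bound proxy for it. The sketch below uses Python's standard `zlib` module as that stand-in, which is an assumption of this example rather than anything in the slides. The strings are made longer than on the slide (6000 characters) so that the compressor's fixed overhead does not dominate. The third string comes from a seeded pseudo-random generator, so strictly it also has a short description (the program plus the seed). It simply looks patternless to `zlib`.

```python
import random
import zlib

def compressed_size(bits):
    """Bytes needed by zlib (level 9) to store a string of '0'/'1' characters."""
    return len(zlib.compress(bits.encode("ascii"), 9))

N = 6000                 # assumed length, chosen for illustration
rng = random.Random(0)   # fixed seed so the run is repeatable

all_zero = "0" * N
periodic = "001" * (N // 3)
random_bits = "".join(rng.choice("01") for _ in range(N))

for name, s in (("all zero", all_zero), ("periodic", periodic), ("random", random_bits)):
    print(f"{name:9s} length {len(s)}  compressed to {compressed_size(s)} bytes")
```

The two patterned strings shrink to a few dozen bytes, while the random-looking one stays close to the roughly 750 bytes needed to store 6000 independent bits.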
![Page 29: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/29.jpg)
Information and Translating
• Which contains more “information”
– 5,6,4…
– five, six, four, …
• Now consider Shakespeare in English and German.
• If we translate word for word – the number of pages would increase (10%).
• If we have a dictionary – this is a “one off cost” in principle – an increase of fixed amount!!!!
digit word
1 one
2 two
3 three
4 four
… …
![Page 30: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/30.jpg)
2nd Law of Thermodynamics
1. Entropy (disorder) increases (statistically) (closed system)
2. Things naturally become untidy (definition of untidy?).
3. Only irreversible law of physics
4. S = K log W
5. W = number of microstates corresponding to that macrostate (ratio)
![Page 31: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/31.jpg)
2nd law e.g. Vibrate 2 Dice on a Tray
• Microstate
• Values on each dice
• E.g. 3,5
• Macrostate
• Sum
• E.g. 8=3+5
• probabilities
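The two-dice example can be enumerated directly: each ordered pair of faces is a microstate, each sum is a macrostate, and W counts how many microstates give that sum. The sketch below assumes two fair, distinguishable six-sided dice and reports log2 W in bits, dropping the constant K from S = K log W.

```python
import math
from collections import Counter
from itertools import product

# Every ordered pair of faces is one microstate (36 in total).
microstates = list(product(range(1, 7), repeat=2))

# W[total] = number of microstates whose faces sum to `total` (the macrostate).
W = Counter(a + b for a, b in microstates)

for total in sorted(W):
    probability = W[total] / len(microstates)
    print(f"sum {total:2d}: W = {W[total]}  p = {probability:.3f}  log2 W = {math.log2(W[total]):.2f}")
```

The sum 7 has the most microstates (6 of 36), so a vibrated tray is most often found there. The sums 2 and 12 each have a single microstate and zero entropy.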
![Page 32: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/32.jpg)
Maxwell’s Demon 1860s
The demon can separate the atoms.
ENERGY FOR FREE
![Page 33: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/33.jpg)
Maxwell’s Demon 2
• We can do work (energy) on the gas by compressing either piston. Log v1/v2
• We can half push the pistons in order e.g. 010 (left, right, left)
• We have reduced the entropy of the gas.
• K log (#microstates)
• 3 bits of information
• Divide cylinder into 8
![Page 34: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/34.jpg)
Experimental verification of Landauer’s principle
• Irreversible transformation K T ln 2 (delete a BIT)
• Nature 483, 187–189 (08 March 2012) doi:10.1038/nature10872 Received 11 October 2011 Accepted 17 January 2012 Published online 07 March 2012
• http://www.nature.com/nature/journal/v483/n7388/full/nature10872.html
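To get a feel for the size of K T ln 2, the sketch below evaluates it numerically. The temperature of 300 K (roughly room temperature) and the one-gigabyte example are assumptions chosen for illustration. The Boltzmann constant is the exact value fixed by the 2019 SI definition.

```python
import math

BOLTZMANN_K = 1.380649e-23  # J/K, exact by the 2019 SI definition

def landauer_limit(temperature_kelvin):
    """Minimum heat in joules dissipated when one bit is erased at temperature T."""
    return BOLTZMANN_K * temperature_kelvin * math.log(2)

ROOM_TEMPERATURE = 300.0  # kelvin; assumed for illustration

per_bit = landauer_limit(ROOM_TEMPERATURE)
per_gigabyte = per_bit * 8e9  # 10**9 bytes of 8 bits each

print(f"per bit:      {per_bit:.3e} J")
print(f"per gigabyte: {per_gigabyte:.3e} J")
```

At 300 K the bound is a little under 3 × 10⁻²¹ J per bit, far below what present-day hardware dissipates per operation.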
![Page 35: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/35.jpg)
Bald Man
• What does he say to the barber each time?
• How much “information” is contained in his hair?
![Page 36: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/36.jpg)
Which book?
![Page 37: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/37.jpg)
Which book?
• Toss a coin!!!!
![Page 38: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/38.jpg)
- .... . . -. -..
![Page 39: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/39.jpg)
- .... . . -. -.. The end
![Page 40: Cmprssd Vw f Infrmtn Thry: A Compressed View of Information John Woodward jrw@cs.stir.ac.ukjrw@cs.stir.ac.uk 1.Is a picture really worth 1000 words? 2.Does.](https://reader038.fdocuments.in/reader038/viewer/2022110206/56649cdc5503460f949a795b/html5/thumbnails/40.jpg)
Next Public Lecture
• http://www.maths.stir.ac.uk/lectures/
• 2nd April (2 weeks)
• Ant hills, traffic jams and social segregation: modelling the world from the bottom up
Dr Savi Maharaj