Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex...
Transcript of Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex...
![Page 1: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/1.jpg)
Markov ChainsCamille Crumpton, Sisi Xiong, Tyler McDaniel
![Page 2: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/2.jpg)
Questions• What is the matrix A given the following graph when calculating
PageRank according to the simple definition of PageRank?
0
1 3
2
• What is the first name of the man credited for Markov Chains?
• What is one application that you have used personally of Markov Chains?
![Page 3: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/3.jpg)
Tyler McDaniel
• From Asheville, NC
• CS PhD student; work on high-perf dense linear algebra libraries at ICL
• Prior work with graph algorithms in the financial sector
• Also worked in microcontrollers/IoT
![Page 4: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/4.jpg)
Sisi Xiong• Changde, Hunan, China
• PhD student majoring in Electrical Engineering.
• Research interests: processing large datasets using probabilistic approaches: Bloom filters
![Page 5: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/5.jpg)
Camille Crumpton
• Hometown: Knoxville/Maryville, TN
![Page 6: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/6.jpg)
Camille Crumpton
• Computer Science graduate student
• Harpist
• Endurance athlete
![Page 7: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/7.jpg)
Outline• Overview of Markov Chains
• History of Markov Chains
• PageRank – Most Famous Use of Markov Chains• Algorithm• Implementation• Experiments
• Text Algorithm• Algorithm• Implementation• Applications – “Fun” uses!• Experiments
• More applications• Population processes• Adaptive cruise control
• Open Problem
• Discussion
![Page 8: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/8.jpg)
Overview of Markov Chains
• What is a Markov Chain?• A discrete-time stochastic process that satisfies the Markov
property
• Then what is the Markov property?• If one can make predictions about the future of the process
with only having the knowledge of the present state (no knowledge of past states)
• Sometimes called “memoryless” property
![Page 9: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/9.jpg)
Overview of Markov Chains
• Often represented as a graph with probabilities as weights
• In the example to the right, each edge weight represents the probability of the Markov process changing from one state to another
• Can think of a Markov Chain as a stochastic (probability-driven) finite state machine
![Page 10: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/10.jpg)
Overview of Markov Chains
![Page 11: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/11.jpg)
Overview of Markov Chains
• Can also be represented as a transition matrix
1 2 3 4
1 0.6 0.2 0 0.2
2 0.6 0.4 0 0
3 0 0.3 0.4 0.3
4 0.3 0 0 0.7
![Page 12: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/12.jpg)
Overview of Markov Chains: Simple Example
• Let’s use a Markov Chain to make a simple prediction of the weather:
• Raining today 40% rain tomorrow
60% sunny tomorrow
• Sunny today 20% rain tomorrow
80% sunny tomorrow
![Page 13: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/13.jpg)
Overview of Markov Chains: Simple Example
• http://setosa.io/ev/markov-chains/
![Page 14: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/14.jpg)
Overview of Markov Chains: More Complexity
![Page 15: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/15.jpg)
History behind Markov Chains
![Page 16: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/16.jpg)
Andrey Andreyevich Markov
• Born: June 1856
• Death: July 1922 (66 years old)
• Russian mathematician
• Alma mater: St. Petersburg University
• Asked to stay at St. Petersburg University as a researcher upon graduation
• Studied stochastic processes
![Page 17: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/17.jpg)
Markov Chain Beginnings: Probability + Poetry
• Andrey Markov spent hours perusing through Alexander Pushkin’s novel in verse Eugene Onegin
• Discovered patterns in certain letters following other letters
• Created an analysis of vowel and consonant patterns – the precursor to Markov Chains
http://www.americanscientist.org/issues/pub/first-links-in-the-markov-chain
![Page 18: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/18.jpg)
Markov Chain Beginnings: Probability + Poetry
• ”If the current letter I’m reading is a vowel, what is the probability that the next letter I’m reading is a vowel? • A consonant?
• What about three letters later?
• Ten letters later?
http://www.americanscientist.org/issues/pub/first-links-in-the-markov-chain
![Page 19: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/19.jpg)
Markov Chain Beginnings: Probability + Poetry
"My uncle's shown his good intentionsBy falling desperately ill;His worth is proved; of all inventionsWhere will you find one better still?He's an example, I'm averring;But, God, what boredom -- there, unstirring,By day, by night, thus to be bidTo sit beside an invalid!Low cunning must assist devotionTo one who is but half-alive:You puff his pillow and contriveAmusement while you mix his potion;You sign, and think with furrowed brow --'Why can't the devil take you now?' "
http://www.americanscientist.org/issues/pub/first-links-in-the-markov-chain
![Page 20: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/20.jpg)
Markov Chain Beginnings: Probability + Poetry
• Andrey Markov created a new branch of probability with his findings, now known as Markov chains
• Extended probability beyond coin flipping & dice rolling (independent events) – to chains of linked events (what happens next depends on current state of the system)
http://www.americanscientist.org/issues/pub/first-links-in-the-markov-chain
![Page 21: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/21.jpg)
Markov Chain Beginnings: Probability + Poetry
![Page 22: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/22.jpg)
Markov Chains: Two Algorithms/Applications
• PageRank• Application – we use it everyday!• Algorithm• Implementation• Experiments
• Markov Chain Text Algorithm (a.k.a. Markov Chain Algorithm)• Algorithm• Implementation• Applications• Experiment
![Page 23: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/23.jpg)
PageRank: Introduction
• Inputs• A bunch of webpages (vertices), each of which has links to other
webpages(edges).
• Directed graph!
A…C…
B ……C
C…DE
D
E
![Page 24: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/24.jpg)
PageRank: Introduction
• Purpose• Rank all webpages in terms of “importance”.
• How to define “importance”• Analogy: citation, papers with more citations
are more important.
• Option: count how many backlinks a webpage has.
• Caveat: if a page has a backlink from an “important” page, it also should be somewhat important.
• Weighted directed graph!
![Page 25: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/25.jpg)
PageRank: Introduction
• Original Paper• Page, L., Brin, S., Motwani, R., & Winograd, T. (1999). The PageRank citation
ranking: Bringing order to the web. Stanford Info Lab.
• PageRank works by counting the number and quality of links to a page to determine a rough estimate of how important the website is. The underlying assumption is that more important websites are likely to receive more links from other websites.*
*https://web.archive.org/web/20111104131332/https://www.google.com/competition/howgooglesearchworks.html
![Page 26: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/26.jpg)
PageRank: Definition• Intuitively
• A webpage has high rank if the sum of the ranks of its backlinks is large.
• Simple definition• u: a webpage
• R(u): rank of u
• F(u): the set of webpages that u points to.
• B(u): the set of webpages that point to u.
• N(u): the number of links from u (forward links).
• c: a factor for normalization, i.e., a constant.
𝑅 𝑢 = 𝑐
𝑣∈𝐵𝑢
𝑅(𝑣)
𝑁𝑣
![Page 27: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/27.jpg)
PageRank: Definition
• Construct a n by n matrix A:
• 𝐴𝑢,𝑣 = 1/𝑁𝑣
0
• R = [R(1), R(2), …, R(n)]
• R = cAR
• R is an eigenvalue of A with eigenvalue c.
𝑅 𝑢 = 𝑐
𝑣∈𝐵𝑢
𝑅(𝑣)
𝑁𝑣
, if there is an edge from v to u
, otherwise
![Page 28: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/28.jpg)
PageRank: Example
𝑅0
𝑅1
𝑅2
𝑅3
=
0 0 0 11/2 0 0 01/2 1/2 0 00 1/2 1 0
×
𝑅0
𝑅1
𝑅2
𝑅3
0
1 3
2
𝐴𝑢,𝑣 = 1/𝑁𝑣
0
, if there is an edge from v to u
, otherwise
𝑅0 = 1 × 𝑅3 𝑅1 =1
2× 𝑅0
𝑅2 =1
2× 𝑅0 +
1
2× 𝑅1 𝑅3 =
1
2× 𝑅1 + 1 × 𝑅2
![Page 29: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/29.jpg)
PageRank: Rank sink issue
• Loop (u, v) accumulates rank, but never distributes rank (no forward links).
• Modified PageRank• E(u) is some vector that is considered as a source of rank
u v
w
𝑅′ 𝑢 = 𝑐
𝑣∈𝐵𝑢
𝑅′(𝑣)
𝑁𝑉+ 𝑐𝐸(𝑢)
![Page 30: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/30.jpg)
PageRank: Random Surfer Model
• A “random surfer” keeps clicking on successive links randomly.
• If entering a rank sink, the surfer “gets bored” and jumps to a random page.
𝑅′ 𝑢 = 𝑐
𝑣∈𝐵𝑢
𝑅′(𝑣)
𝑁𝑉+1 − 𝑐
𝑁, c = 0.85
u v
w
x
![Page 31: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/31.jpg)
PageRank: Markov chain
• The sum of values in each column in A is 1.
• A is a Markov matrix!
• All probability is non-negative, it’s guaranteed there is a steady-state vector.
• PageRank always has a steady state.
𝑅0
𝑅1
𝑅2
𝑅3
=
0 0 0 11/2 0 0 01/2 1/2 0 00 1/2 1 0
×
𝑅0
𝑅1
𝑅2
𝑅3
![Page 32: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/32.jpg)
PageRank: Convergence properties
• Experiments on a database based on 322 million links• 52 iterations.
• Scale very well.
![Page 33: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/33.jpg)
PageRank: Implementation
• R(0) = [1/N, 1/N, …, 1/N]
• e = 0.0001
• delta = 0
• while true
• 𝑅𝑖+1 = 0.85𝐴𝑅𝑖 +0.15
𝑁• delta = 𝑅𝑖+1 − 𝑅𝑖 1
• if delta < e• break
• 𝑅𝑖 = 𝑅𝑖+1
• return 𝑅𝑖
![Page 34: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/34.jpg)
PageRank: Experiment configuration
• Datasets• Test graphs from homework 2.
• Several graphs from http://snap.stanford.edu/.
• Metrics• Convergence speed, in terms of iterations.
• PageRank results: sparse vs dense graphs.
• PageRank results based on real graphs.
![Page 35: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/35.jpg)
PageRank: Experiment results
Iterations to convergence of all test graphs from hw2
EV
Observation: 1. Results of all graphs converge in less than 20 iterations.2. No obvious pattern between convergence speed and graph size
![Page 36: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/36.jpg)
PageRank: Experiment results
PageRank results comparison between dense and sparse graphs
Observation: PageRank of results of dense graphs are more spread out.
![Page 37: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/37.jpg)
PageRank: Experiment results
• Why there is a jump there?
• Suppose if a webpage (vertex) has no backlinks (no webpage has links to it), the only rank source comes from “random jumps”, which is a relatively small probability.
• Hypothesis: If a vertex has no backlinks, its page rank is the smallest.
• Experiment results: matching!
Graph Random.500.01.txt Random.1000.01.txt
Number of webpages whichhave smallest PageRank
101 104
Number of webpages which have no backlinks
101 104
![Page 38: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/38.jpg)
PageRank: Experiment results
http://snap.stanford.edu/data/wiki-Vote.html http://snap.stanford.edu/data/cit-HepTh.html
Citation graph from the e-print arXivV = 27,770, E = 352,807
iterations = 17
Wikipedia administer vote graphV = 7,115, E = 103,689
iterations = 17
Number of vertices which have no backlinks = 4734
Number of vertices which have no backlinks = 4590
![Page 39: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/39.jpg)
Next Algorithm/Application! Markov Chain Text
![Page 40: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/40.jpg)
Markov Chain Text (Intro)
• Process input text, analyzing letter/word (token) probabilities
• Create graph (transition matrix) to represent those probabilities
• Generate “similar” text using graph• Order: how many prior tokens to use when generating next token
• Example:• Input: Foo bar foo bar bar Foo Bar \n
1 .33
.33
.33
![Page 41: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/41.jpg)
Markov Chain Text (Algorithm)
• Simple (bare bones!) pseudocode for order 1:
• PROCESS:Create dictionary
For line in input:For token_1, token_2 in line:
Add to dictionary -> {token_1, token_2}
• GENERATE:Choose first word from dictionary -> token_1
For each desired word in output:Lookup token_2 in dictionary using token_1token_2 -> token_1
![Page 42: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/42.jpg)
Markov Chain Text (Implementation)
• Store probabilities for each word based on order; use transition matrix rather than dictionary.
• Gather metadata and use parsing rules to format output correctly (punctuation, titles before names, etc.)
• May be complemented by other methods…n-gram methods for partial words, “fat-finger” grace techniques
![Page 43: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/43.jpg)
Markov Chain Text (Applications)
• One method used in autocomplete software
• Also used in some botshttps://filiph.github.io/markov/
![Page 44: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/44.jpg)
Applications – Autocomplete on Phones
Alternative caption: ”Although the Markov-chain text model is still rudimentary, it recently gave me “Massachusetts Institute of America”. Although I have to admit it sounds prestigious.
![Page 45: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/45.jpg)
Applications – Autocomplete on Phones
• Most text messaging applications use a Markov Chain model to predict what word you wish to type next, based on the last word.
![Page 46: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/46.jpg)
Applications – Subreddit generation
• https://www.reddit.com/r/SubredditSimulator/comments/3g9ioz/what_is_rsubredditsimulator/
• Fully-automated subredditthat generates random submissions and comments using Markov chains
![Page 47: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/47.jpg)
Applications – Subreddit generation
• Explanation of the Markov Chain on Reddit:• “Basically, you feed in a bunch of sentences, and even though it has no
understanding of the meaning of the text, it picks up on patterns like "word A is often followed by word B". “
• “Then when you want to generate a new sentence, it "walks" its way through from the start of a sentence to the end of one, picking sequences of words that it knows are valid based on that initial analysis. “
• “So generally short sequences of words in the generated sentences will make sense, but often not the whole thing.”
![Page 48: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/48.jpg)
Markov Chain Text (Experiment)
• Writing a sonnet using Markov chain generated using Billy Shakespeare’s extant sonnets:
• C++ (Processed: .05 seconds, generated 100 lines: .91 seconds)
My days are
In thy wrong,
that beauty tempting her pretty wrongs that thereby beauty's form in this.
Black lines of that I better state out of thine
in their wills count the lesson true,
the other mine,
thou live twice in lease
find true that wild music sadly?
sweets dost deceive.
![Page 49: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/49.jpg)
Markov Chain Text (Experiment)
• Writing a sonnet using Markov chain generated using Billy Shakespeare’s extant sonnets:
• Python (Processed: .51 seconds, generated 100 lines: .71 seconds)
Hence, thou wouldst thou thy might
To bear greater wrong, than my invention quite
Dulling my friend
O for thy soul's imaginary sight
Presents thy dial's shady stealth mayst in their virtue answer Muse,
Make answer, this fair were born
And dost beguile the conquest of heaven it hath masked not false to ruminate
That on the world, unbless some other write to thy monument
When that thou age black ink my transgression bow
![Page 50: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/50.jpg)
Markov Chain Text (Experimental Results)
0
0.2
0.4
0.6
0.8
1
1.2
0 10000 20000 30000 40000 50000
Processing time (seconds) function of word count, C++ and Python
C++ Python
0
1
2
3
4
5
6
0 10000 20000 30000 40000 50000
Generation time (seconds) function of word count, C++ and Python
C++ Python
Results from Haswell i7, OS X El Capitan
![Page 51: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/51.jpg)
Markov Chain Text (Some Conclusions)
• Implementations were simple, sequential programs modified from existing public github repos.
• C++ version processed text much more quickly, as expected.
• Python version outperformed in generating text; why?• Implementation detail; script stores processed text into local SQL dictionary.
• C++ version operates on vectors of strings
![Page 52: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/52.jpg)
Markov Chains: More Applications
• Since the next state in a Markov chain is simply a function of the last state and a random variable, we can easily see applications for Markov chains…• Queue lengths in call centers
• Stresses on materials
• Waiting time in production facilities
• Inventories in supply chains
• Water levels in dams
• Stock prices
• … and more
![Page 53: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/53.jpg)
Population Process: Introduction
• Markov process• State: number of individuals in a population• Changes: addition or removal of individuals
• Applications• Ecology• Telecommunications • Queueing theory • Chemical kinetics
• An example: Birth-death process
![Page 54: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/54.jpg)
• A special case of continuous-time Markov process
• Two types of changes• Birth
• Death
Population process: Birth-death process
0 1 2 3 4 5
𝜆0 𝜆2 𝜆3 𝜆4𝜆1
𝜇1 𝜇3 𝜇4 𝜇5𝜇2
…
𝜆𝑖Birth rate: Death rate: 𝜇𝑖
![Page 55: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/55.jpg)
• Pure birth process
• Poisson process
Population process: Birth-death process
0 1 2 3 4 5
𝜆0 𝜆2 𝜆3 𝜆4𝜆1
…
0 1 2 3 4 5 …
𝜆 𝜆 𝜆 𝜆 𝜆
![Page 56: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/56.jpg)
Population dynamics: Birth-death process
𝑃𝑛 𝑡 + ℎ = 𝑃𝑛 𝑡 1 − 𝜆𝑛ℎ − 𝜇𝑛ℎ + 𝑃𝑛−1 𝑡 𝜆𝑛−1ℎ + 𝑃𝑛+1 𝑡 𝜇𝑛+1ℎ + 𝑜(ℎ)
time
Pop
ula
tio
n
t t+h
nn-1
n+1
𝑃𝑛 𝑡 + ℎ − 𝑃𝑛 𝑡
ℎ= − 𝜆𝑛 + 𝜇𝑛 𝑃𝑛 𝑡 + 𝜆𝑛−1𝑃𝑛−1 𝑡 + 𝜇𝑛+1𝑃𝑛+1 𝑡
−𝜆0 𝜆0𝜇1 −(𝜆1 + 𝜇1) 𝜆1
𝜇2 −(𝜆2 + 𝜇2) 𝜆3𝜇3 −(𝜆3 + 𝜇3) 𝜆4
𝜇4 −(𝜆4 + 𝜇5)
⋯
Transition rate matrix:
![Page 57: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/57.jpg)
Population dynamics: Birth-death process
• Applications• Predict extinction time
• Chemistry: • radioactive transformations: birth process
• New atom decay: death process
• Queueing theory• M/M/1 model
• M/M/k model
![Page 58: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/58.jpg)
Applications: Adaptive Cruise Control
• Some cruise-control on vehicles use Markov chains to calculate speed
• Operating environment is stochastic
• Road grade on route can also be seen as stochastic
![Page 59: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/59.jpg)
Applications: Adaptive Cruise Control
• Vehicle speed and following distance can be optimized for best-on-average fuel economy and optimized for travel time
• More in Optimization and Optimal Control in Automotive Systems
![Page 60: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/60.jpg)
Open Problem
![Page 61: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/61.jpg)
Open Problem: Cutoffs
• Stationary distribution: vector that represents the proportion of time a given Markov chain occupies each state, given enough time.
• Total variation distance: distance between two probability distributions is equal to the max distance assigned to any event by the distributions
• Mixing time: number of steps required for a chain for the distance from stationary to be small
![Page 62: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/62.jpg)
Open Problem: Cutoffs
• Build a Markov chain via a random walk over the symmetric group on n symbols
• The resulting chain’s stationary distribution will be uniform; this can be interpreted as a “fully mixed” deck.
• Define “cutoff” as the existence of a sequence c(n) such that as n approaches infinity, the distance to uniformity approaches zero.
• If we use a random-to-random shuffle, does such a cutoff exist?
![Page 63: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/63.jpg)
Open Problem: Cutoffs
http://pages.uoregon.edu/dlevin/MARKOV/markovmixing.pdf
![Page 64: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/64.jpg)
References
Page, L., Brin, S., Motwani, R., & Winograd, T. (1999). The PageRank citation ranking: Bringing order to the web. Stanford Info
Lab.
snap.stanford.edu
https://en.wikipedia.org/wiki/PageRank
https://math.dartmouth.edu/~pw/math100w13/li.pdf
https://en.wikipedia.org/wiki/Population_process
https://en.wikipedia.org/wiki/Birth%E2%80%93death_process
https://en.wikipedia.org/wiki/Queueing_theory
https://en.wikipedia.org/wiki/Andrey_Markov
http://setosa.io/ev/markov-chains/
http://www.suaybarslan.com/birthdeathprocess4datamodelling.pdf
Hayes, B. (2013). First Links in the Markov Chain. American Scientist, 101(2), 92.
Harald Waschl, Ilya Kolmanovsky, Maarten Steinbuch, Luigi del Re. Optimization and Optimal Control in Automative Systems.https://books.google.com/books?id=ebS5BQAAQBAJ&lpg=PA148&dq=how%20does%20cruise%20control%20work%20markov%20chains&pg=PA142#v=onepage&q=how%20does%20cruise%20control%20work%20markov%20chains&f=false
![Page 65: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/65.jpg)
Discussion
![Page 66: Markov Chains - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/... · •Hypothesis: If a vertex has no backlinks, its page rank is the smallest. •Experiment results: matching!](https://reader034.fdocuments.in/reader034/viewer/2022050602/5fa974e374ed931cdb11338e/html5/thumbnails/66.jpg)
Questions Revisited• What is the matrix A given the following graph when calculating
PageRank according to the simple definition of PageRank?
0
1 3
2
• What is the first name of the man credited for Markov Chains?
• What is one application that you have used personally of Markov Chains?