Markov Chain Analysis
What is a Markov Model?
• In probability theory, a Markov model is a stochastic model used
to model randomly changing systems where it is assumed that future
states depend only on the present state and not on the sequence of
events that preceded it (that is, it assumes the Markov property).
Generally, this assumption enables reasoning and computation with the
model that would otherwise be intractable.
• Some examples:
– Snakes and ladders game
– Weather system
Assumptions for Markov model
• a fixed set of states,
• fixed transition probabilities, and
• the possibility of getting from any state to any other through a series of transitions.
• Under these assumptions (plus aperiodicity, which the examples here satisfy), a Markov process converges to a unique distribution over states. This means that what happens in the long run won't depend on where the process started or on what happened along the way; a numeric check follows below.
• What happens in the long run is completely determined by the transition probabilities – the likelihoods of moving between the various states.
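To make the convergence claim concrete, here is a minimal base-R sketch (using the two-state weather matrix that appears later in these slides): raising the transition matrix to a high power makes all of its rows identical, i.e., the starting state stops mattering.

```r
# Transition matrix of the rain / no-rain weather chain (rows sum to 1).
P <- matrix(c(0.4, 0.6,
              0.2, 0.8), nrow = 2, byrow = TRUE,
            dimnames = list(c("rain", "no rain"), c("rain", "no rain")))

Pn <- P
for (i in 1:50) Pn <- Pn %*% P  # P^51 via repeated multiplication

round(Pn, 4)  # both rows converge to (0.25, 0.75): the unique stationary distribution
```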
Types of Markov models & when to use which model

                       State fully observable     State partially observable
System is autonomous   Markov chain               Hidden Markov model
System is controlled   Markov decision process    Partially observable Markov decision process

Source: Wikipedia
Markov chain
• Here the system state is fully observable and the system is autonomous.
• Simplest of all Markov models.
• A Markov chain is a random process that undergoes transitions from one state to another on a state space.
• It is required to possess a property that is usually characterized as "memoryless": the probability distribution of the next state depends only on the current state, not on the sequence of events that preceded it.
• Also remember that here time moves in discrete steps.
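Because the next state depends only on the current state, simulating a Markov chain is just repeated sampling from one row of the transition matrix. A minimal sketch (the helper simulate_chain below is our illustration, not from any package):

```r
# Simulate a sample path of a discrete-time Markov chain.
# P: transition matrix with named rows/columns, start: initial state, n: path length.
simulate_chain <- function(P, start, n) {
  states <- colnames(P)
  path <- character(n)
  path[1] <- start
  for (t in 2:n) {
    # Memoryless: the distribution of the next state depends only on path[t - 1].
    path[t] <- sample(states, size = 1, prob = P[path[t - 1], ])
  }
  path
}
```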
Let's try to understand Markov chains with a very simple example
• Weather:
– raining today: 40% rain tomorrow, 60% no rain tomorrow
– not raining today: 20% rain tomorrow, 80% no rain tomorrow

Stochastic Finite State Machine:
[Two-state diagram: rain stays rain with 0.4, rain goes to no rain with 0.6, no rain goes to rain with 0.2, no rain stays no rain with 0.8]
Markov Process
Simple Example
Weather:
• raining today: 40% rain tomorrow, 60% no rain tomorrow
• not raining today: 20% rain tomorrow, 80% no rain tomorrow

The transition matrix (rows and columns ordered Rain, No rain; row = today, column = tomorrow):

P = | 0.4  0.6 |
    | 0.2  0.8 |

• Stochastic matrix: rows sum up to 1
• Doubly stochastic matrix: rows and columns sum up to 1
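As a sanity check, the matrix can be entered in base R and tested for stochasticity (a sketch):

```r
P <- matrix(c(0.4, 0.6,
              0.2, 0.8), nrow = 2, byrow = TRUE,
            dimnames = list(c("Rain", "No rain"), c("Rain", "No rain")))

rowSums(P)  # (1, 1): rows sum to 1, so P is stochastic
colSums(P)  # (0.6, 1.4): columns do not sum to 1, so P is not doubly stochastic
```

With the simulate_chain helper sketched earlier, simulate_chain(P, "Rain", 10) draws a ten-day weather path from this chain.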
Markov Process
• Markov Property: X_{t+1}, the state of the system at time t+1, depends only on the state of the system at time t:

    Pr[X_{t+1} = x_{t+1} | X_t = x_t, X_{t-1} = x_{t-1}, ..., X_1 = x_1] = Pr[X_{t+1} = x_{t+1} | X_t = x_t]

[Chain diagram: X1 -> X2 -> X3 -> X4 -> X5]

• Stationary Assumption: transition probabilities are independent of the time t:

    Pr[X_{t+1} = b | X_t = a] = p_ab

Let X_i be the weather of day i, 1 <= i <= t. By the Markov property, the distribution of X_{t+1} can be computed from X_t alone, even though the whole history X_1, ..., X_t is available.
Markov Process
Gambler's Example
– Gambler starts with $10 (the initial state)
– At each play, one of the following happens:
  • Gambler wins $1 with probability p
  • Gambler loses $1 with probability 1-p
– The game ends when the gambler goes broke, or gains a fortune of $100
  (both 0 and 100 are absorbing states)

[State diagram: states 0, 1, 2, ..., 99, 100; start at $10; from each interior state the chain moves up with probability p and down with probability 1-p]
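A quick Monte Carlo sketch of this chain (our illustration, not part of the original slides): for a fair game, p = 0.5, the probability of reaching $100 from $10 is exactly 10/100, and the simulation estimate should come out close to that.

```r
# One game of gambler's ruin: start at $10, bet $1 per play until $0 or $100.
play <- function(p, start = 10, goal = 100) {
  x <- start
  while (x > 0 && x < goal) {
    x <- x + sample(c(1, -1), size = 1, prob = c(p, 1 - p))
  }
  x == goal  # TRUE if the gambler reached the fortune, FALSE if broke
}

set.seed(1)
mean(replicate(2000, play(p = 0.5)))  # estimate of Pr[reach $100], close to 0.1
```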
Markov Process
• Markov process – described by a stochastic FSM
• Markov chain – a random walk on this graph (a distribution over paths)
• Edge weights give us the one-step probabilities:

    Pr[X_{t+1} = b | X_t = a] = p_ab

• We can ask more complex questions, like:

    Pr[X_{t+2} = b | X_t = a] = ?

[Same gambler state diagram: states 0, 1, 2, ..., 99, 100, start at $10]
Markov Process
Coke vs. Pepsi Example
• Given that a person's last cola purchase was Coke, there is a 90% chance that his next cola purchase will also be Coke.
• If a person's last cola purchase was Pepsi, there is an 80% chance that his next cola purchase will also be Pepsi.

[State diagram: coke stays coke with 0.9, coke goes to pepsi with 0.1, pepsi stays pepsi with 0.8, pepsi goes to coke with 0.2]

The transition matrix (rows and columns ordered coke, pepsi):

P = | 0.9  0.1 |
    | 0.2  0.8 |
Markov Process
Coke vs. Pepsi Example (cont.)
Given that a person is currently a Pepsi purchaser, what is the probability that he will purchase Coke two purchases from now?

Pepsi -> ? -> Coke

Pr[Pepsi -> ? -> Coke] = Pr[Pepsi -> Coke -> Coke] + Pr[Pepsi -> Pepsi -> Coke]
                       = 0.2 * 0.9 + 0.8 * 0.2 = 0.34

The same answer appears in the two-step transition matrix:

P^2 = | 0.9  0.1 |^2 = | 0.83  0.17 |
      | 0.2  0.8 |     | 0.34  0.66 |

(the pepsi-row, coke-column entry is 0.34)
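The same two-step answer in base R (a sketch; %*% is matrix multiplication):

```r
P <- matrix(c(0.9, 0.1,
              0.2, 0.8), nrow = 2, byrow = TRUE,
            dimnames = list(c("coke", "pepsi"), c("coke", "pepsi")))

P2 <- P %*% P        # two-step transition matrix
P2["pepsi", "coke"]  # 0.34 = Pr[Coke two purchases from now | Pepsi now]
```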
Markov Process
Coke vs. Pepsi Example (cont.)
Given that a person is currently a Coke purchaser, what is the probability that he will buy Pepsi at the third purchase from now?

P^3 = P^2 * P = | 0.83  0.17 | | 0.9  0.1 | = | 0.781  0.219 |
                | 0.34  0.66 | | 0.2  0.8 |   | 0.438  0.562 |

(the coke-row, pepsi-column entry gives the answer: 0.219)
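And in R, reusing P and P2 from the previous sketch:

```r
P3 <- P2 %*% P           # three-step transition matrix
P3["coke", "pepsi"]      # 0.219 = Pr[Pepsi three purchases from now | Coke now]
```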
Markov Process
Coke vs. Pepsi Example (cont.)
• Assume each person makes one cola purchase per week.
• Suppose 60% of all people now drink Coke, and 40% drink Pepsi.
• What fraction of people will be drinking Coke three weeks from now?

P = | 0.9  0.1 |      P^3 = | 0.781  0.219 |
    | 0.2  0.8 |            | 0.438  0.562 |

Pr[X3 = Coke] = 0.6 * 0.781 + 0.4 * 0.438 = 0.6438

Let Qi be the distribution in week i:
Q0 = (0.6, 0.4) – the initial distribution
Q3 = Q0 * P^3 = (0.6438, 0.3562)
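The same computation in base R, reusing the cola matrix P defined in the sketch above:

```r
Q0 <- c(coke = 0.6, pepsi = 0.4)  # initial distribution over drinkers
P3 <- P %*% P %*% P               # three-step transition matrix
Q0 %*% P3                         # (0.6438, 0.3562): week-3 distribution
```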
Markov Process
Coke vs. Pepsi Example (cont.)
Simulation:

[Plot: Pr[X_i = Coke] as a function of week i; starting from either state, the curve converges to 2/3]

The limit is the stationary distribution, a row vector pi satisfying pi * P = pi:

(2/3, 1/3) * | 0.9  0.1 | = (2/3, 1/3)
             | 0.2  0.8 |

[State diagram: coke stays coke with 0.9, coke goes to pepsi with 0.1, pepsi stays pepsi with 0.8, pepsi goes to coke with 0.2]
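The stationary distribution can also be computed directly, as the left eigenvector of P for eigenvalue 1, normalized to sum to 1 (a base-R sketch, reusing the cola matrix P; eigen(t(P)) yields the left eigenvectors of P):

```r
ev <- eigen(t(P))                                        # left eigen-decomposition of P
stat <- Re(ev$vectors[, which.min(abs(ev$values - 1))])  # eigenvector for eigenvalue 1
stat <- stat / sum(stat)                                 # normalize to probabilities
stat        # (2/3, 1/3)
stat %*% P  # equals stat again, i.e. pi * P = pi
```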
Supervised vs Unsupervised
 Decision tree learning is "supervised learning", as we know the correct output of each example.
 Learning based on Markov chains is "unsupervised learning", as we are not told the correct output (e.g., the correct "next letter") for each example.
Implementation Using R
 msm (Jackson 2011) handles multi-state models for panel data;
 mcmc (Geyer and Johnson 2013) implements the Markov chain Monte Carlo (MCMC) approach;
 HMM (Himmelmann and www.linhi.com 2010) fits hidden Markov models with covariates;
 mstate fits multi-state models based on Markov chains for survival analysis (de Wreede, Fiocco, and Putter 2011);
 markovchain (Spedicato) provides classes and methods for handling discrete-time Markov chains, used in the example below.
Implementation Using R
 Example 1: Weather Prediction
The Land of Oz is acknowledged not to have ideal weather conditions at all: the weather is very often snowy or rainy and, what is more, there are never two nice days in a row. Consider three weather states: rainy, nice and snowy. Given that today is a nice day, the corresponding stochastic row vector is w0 = (0, 1, 0); compute the forecast after 1, 2 and 3 days.
Solution: please refer to the attached solution.R; a sketch follows below.
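A minimal sketch of what solution.R might contain, using the markovchain package. The slide does not list the transition probabilities, so the classic Land of Oz values from Kemeny and Snell are assumed below (note that nice -> nice is 0, matching "never two nice days in a row"):

```r
library(markovchain)

ozStates <- c("rainy", "nice", "snowy")
# Assumed Land of Oz transition matrix (Kemeny & Snell); each row sums to 1.
ozMatrix <- matrix(c(0.50, 0.25, 0.25,
                     0.50, 0.00, 0.50,
                     0.25, 0.25, 0.50), nrow = 3, byrow = TRUE,
                   dimnames = list(ozStates, ozStates))
mcOz <- new("markovchain", states = ozStates,
            transitionMatrix = ozMatrix, name = "Oz Weather")

w0 <- c(0, 1, 0)  # today is a nice day
w0 * mcOz         # forecast after 1 day: (0.50, 0.00, 0.50)
w0 * (mcOz ^ 2)   # forecast after 2 days
w0 * (mcOz ^ 3)   # forecast after 3 days
```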
Sources
 SlideShare
 Wikipedia
 Google
