Markov chain analysis uses Markov models to analyze randomly changing systems in which future states depend only on the present state, not on past states. A Markov chain has a fixed set of states and fixed transition probabilities between states, and under the assumptions listed below it converges to a unique long-run distribution. Markov chains assume that states are fully observable and that the system is autonomous. Common examples include weather patterns and gambling. Markov chains can be modeled and simulated using R packages such as msm and markovchain.
2. What is a Markov Model?
• In probability theory, a Markov model is a stochastic model used
to model randomly changing systems where it is assumed that future
states depend only on the present state and not on the sequence of
events that preceded it (that is, it assumes the Markov property).
Generally, this assumption enables reasoning and computation with the
model that would otherwise be intractable.
• Some Examples are:
– Snake & ladder game
– Weather system
3. Assumptions for a Markov model
• a fixed set of states,
• fixed transition probabilities, and the possibility of
getting from any state to any other through a series of
transitions.
• Under these assumptions, a Markov process converges to a
unique distribution over states. This means that what happens
in the long run won’t depend on where the process started or on
what happened along the way.
• What happens in the long run is completely determined by the
transition probabilities – the likelihoods of moving between the
various states.
4. Types of Markov models & when to use which model

                           System state is          System state is
                           fully observable         partially observable
  System is autonomous     Markov Chain             Hidden Markov Model
  System is controlled     Markov Decision Process  Partially Observable
                                                    Markov Decision Process

Source: Wikipedia
5. Markov chain
• Here the system states are fully observable and the
system is autonomous.
• Simplest of all Markov models.
• A Markov chain is a random process that undergoes
transitions from one state to another on a state space.
• It is required to possess a property that is usually
characterized as "memoryless": the probability
distribution of the next state depends only on the
current state and not on the sequence of events that
preceded it.
• Also remember that we are treating time as moving
in discrete steps.
6. Let's try to understand a Markov chain with a very
simple example
• Weather:
– raining today: 40% rain tomorrow, 60% no rain tomorrow
– not raining today: 20% rain tomorrow, 80% no rain tomorrow

Stochastic Finite State Machine:
[Figure: two states, rain and no rain; rain stays rain with probability 0.4
and moves to no rain with 0.6; no rain moves to rain with 0.2 and stays
no rain with 0.8]
7. Markov Process - Simple Example
Weather:
• raining today: 40% rain tomorrow, 60% no rain tomorrow
• not raining today: 20% rain tomorrow, 80% no rain tomorrow

The transition matrix:

              Rain   No rain
   Rain        0.4      0.6
   No rain     0.2      0.8

• Stochastic matrix: rows sum up to 1
• Doubly stochastic matrix: rows and columns sum up to 1
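These transition probabilities can be checked with a few lines of base R; the following is a minimal sketch (the variable name P and the dimnames are only for illustration):

# Transition matrix of the weather chain
P <- matrix(c(0.4, 0.6,
              0.2, 0.8),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("rain", "no rain"), c("rain", "no rain")))

rowSums(P)     # both rows are 1, so P is a stochastic matrix
P["rain", ]    # distribution over tomorrow's weather given rain today
P %*% P        # two-step transition probabilities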
8. Markov Process
• Markov Property: X_{t+1}, the state of the system at time t+1, depends
only on the state of the system at time t:
Pr[X_{t+1} = x_{t+1} | X_t = x_t, ..., X_1 = x_1] = Pr[X_{t+1} = x_{t+1} | X_t = x_t]
[Figure: chain of states X1 -> X2 -> X3 -> X4 -> X5]
• Stationary Assumption: transition probabilities are independent of
time t:
Pr[X_{t+1} = b | X_t = a] = p_{ab}
Let X_i be the weather on day i, 1 <= i <= t. In general the probability of
X_{t+1} could depend on the whole history X_1, ..., X_t; the Markov property
says it depends only on X_t.
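What these two assumptions mean in code: the next state is drawn using only the current state's row of the transition matrix, and the same matrix is used at every step. A minimal sketch in R (the function name simulate_chain is illustrative):

simulate_chain <- function(P, start, n_steps) {
  states <- rownames(P)
  path <- character(n_steps + 1)
  path[1] <- start
  for (t in 1:n_steps) {
    # Markov property: only the current state's row of P is consulted;
    # stationarity: the same P is used at every time step
    path[t + 1] <- sample(states, size = 1, prob = P[path[t], ])
  }
  path
}

P <- matrix(c(0.4, 0.6, 0.2, 0.8), nrow = 2, byrow = TRUE,
            dimnames = list(c("rain", "no rain"), c("rain", "no rain")))
set.seed(1)
simulate_chain(P, start = "rain", n_steps = 10)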
9. Markov Process - Gambler's Example
• Gambler starts with $10 (the initial state)
• At each play one of the following happens:
– Gambler wins $1 with probability p
– Gambler loses $1 with probability 1-p
• Game ends when the gambler goes broke, or gains a fortune of $100
(both 0 and 100 are absorbing states)
[Figure: state diagram over states 0, 1, 2, ..., 99, 100; each interior state
moves up by 1 with probability p and down by 1 with probability 1-p;
start at $10]
10. Markov Process
• Markov process - described by a stochastic FSM
• Markov chain - a random walk on this graph (distribution over paths)
• Edge weights give us the one-step probabilities Pr[X_{t+1} = b | X_t = a] = p_{ab}
• We can ask more complex questions, like Pr[X_{t+2} = b | X_t = a] = ?
[Figure: the same gambler's-ruin state diagram, states 0 through 100, start at $10]
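A minimal simulation sketch of the gambler's-ruin chain (the function name gamblers_ruin and the choice p = 0.5 are illustrative, not taken from the slide):

gamblers_ruin <- function(start = 10, goal = 100, p = 0.5) {
  x <- start
  while (x > 0 && x < goal) {
    # win $1 with probability p, lose $1 with probability 1 - p
    x <- x + sample(c(1, -1), size = 1, prob = c(p, 1 - p))
  }
  x  # 0 (broke) or goal (fortune reached); both are absorbing states
}

set.seed(42)
outcomes <- replicate(1000, gamblers_ruin())
mean(outcomes == 100)   # estimated probability of reaching $100 before going broke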
11. Markov Process - Coke vs. Pepsi Example
• Given that a person’s last cola purchase was Coke,
there is a 90% chance that his next cola purchase will
also be Coke.
• If a person’s last cola purchase was Pepsi, there is
an 80% chance that his next cola purchase will also be
Pepsi.
[Figure: two-state diagram; Coke stays Coke with probability 0.9 and moves to
Pepsi with 0.1; Pepsi moves to Coke with 0.2 and stays Pepsi with 0.8]

The transition matrix:

             Coke   Pepsi
   Coke       0.9     0.1
   Pepsi      0.2     0.8
12. Markov Process - Coke vs. Pepsi Example (cont.)
Given that a person is currently a Pepsi purchaser, what is the
probability that he will purchase Coke two purchases from now?
(Pepsi -> ? -> Coke)

Pr[Pepsi -> ? -> Coke] = Pr[Pepsi -> Coke -> Coke] + Pr[Pepsi -> Pepsi -> Coke]
                       = 0.2 * 0.9 + 0.8 * 0.2 = 0.34

This is the (Pepsi, Coke) entry of the two-step transition matrix:

         | 0.9  0.1 |           | 0.83  0.17 |
   P  =  | 0.2  0.8 |    P^2 =  | 0.34  0.66 |
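The same two-step calculation in R (a minimal sketch; rows and columns are ordered Coke, Pepsi):

P <- matrix(c(0.9, 0.1,
              0.2, 0.8),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("Coke", "Pepsi"), c("Coke", "Pepsi")))
P2 <- P %*% P          # two-step transition matrix
P2["Pepsi", "Coke"]    # 0.34, matching the hand calculation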
13. Markov Process - Coke vs. Pepsi Example (cont.)
Given that a person is currently a Coke purchaser, what is the
probability that he will buy Pepsi at the third purchase from now?

   P^3  =  P * P^2  =  | 0.781  0.219 |
                       | 0.438  0.562 |

The answer is the (Coke, Pepsi) entry of P^3, i.e. 0.219.
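The three-step matrix in R (P is repeated here so the snippet runs on its own):

P <- matrix(c(0.9, 0.1, 0.2, 0.8), nrow = 2, byrow = TRUE,
            dimnames = list(c("Coke", "Pepsi"), c("Coke", "Pepsi")))
P3 <- P %*% P %*% P    # three-step transition matrix
P3["Coke", "Pepsi"]    # 0.219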
14. Markov Process - Coke vs. Pepsi Example (cont.)
• Assume each person makes one cola purchase per week
• Suppose 60% of all people now drink Coke, and 40% drink Pepsi
• What fraction of people will be drinking Coke three weeks from now?

         | 0.9  0.1 |           | 0.781  0.219 |
   P  =  | 0.2  0.8 |    P^3 =  | 0.438  0.562 |

Pr[X3 = Coke] = 0.6 * 0.781 + 0.4 * 0.438 = 0.6438

Qi - the distribution in week i
Q0 = (0.6, 0.4) - initial distribution
Q3 = Q0 * P^3 = (0.6438, 0.3562)
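The same computation as an R sketch, treating the distribution as a row vector:

P  <- matrix(c(0.9, 0.1, 0.2, 0.8), nrow = 2, byrow = TRUE,
             dimnames = list(c("Coke", "Pepsi"), c("Coke", "Pepsi")))
Q0 <- matrix(c(0.6, 0.4), nrow = 1)   # initial distribution (Coke, Pepsi)
Q3 <- Q0 %*% P %*% P %*% P            # distribution three weeks from now
Q3                                    # 0.6438 0.3562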
15. Markov Process - Coke vs. Pepsi Example (cont.)
Simulation:
[Figure: Pr[Xi = Coke] plotted against week i; the probability converges to 2/3]
stationary distribution: in the long run 2/3 of purchases are Coke and 1/3 are
Pepsi, regardless of the starting state.
[Figure: the Coke/Pepsi state diagram with transition probabilities
0.9, 0.1, 0.2, 0.8]
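The stationary distribution can be verified in R either by raising P to a high power or with steadyStates() from the markovchain package; a minimal sketch, assuming that package is installed (the object name mcCola is illustrative):

P <- matrix(c(0.9, 0.1, 0.2, 0.8), nrow = 2, byrow = TRUE,
            dimnames = list(c("Coke", "Pepsi"), c("Coke", "Pepsi")))

# Power method: for large n, every row of P^n approaches the stationary distribution
Pn <- diag(2)
for (i in 1:100) Pn <- Pn %*% P
Pn             # both rows are approximately (0.667, 0.333)

# Using the markovchain package
library(markovchain)
mcCola <- new("markovchain", states = c("Coke", "Pepsi"),
              byrow = TRUE, transitionMatrix = P, name = "Cola")
steadyStates(mcCola)   # approximately 0.667 (Coke), 0.333 (Pepsi)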
16. Supervised vs Unsupervised
Decision tree learning is “supervised
learning” as we know the correct output of
each example.
Learning based on Markov chains is
“unsupervised learning”, as we do not know the
correct output for the “next letter” (the next state in the sequence).
17. Implementation Using R
• msm (Jackson 2011): handles multi-state models for panel data;
• mcmcR (Geyer and Johnson 2013): implements a Markov chain Monte Carlo
(MCMC) approach;
• hmm (Himmelmann and www.linhi.com 2010): fits hidden Markov models with
covariates;
• mstate: fits multi-state models based on Markov chains for survival analysis
(de Wreede, Fiocco, and Putter 2011);
• markovchain: provides classes and methods for discrete-time Markov chains.
18. Implementation using R
Example 1: Weather Prediction
The Land of Oz is acknowledged not to have ideal weather
conditions at all: the weather is snowy or rainy very often and,
moreover, there are never two nice days in a row. Consider three
weather states: rainy, nice and snowy. Given that today is a nice
day, the corresponding stochastic row vector is w0 = (0, 1, 0);
find the forecast after 1, 2 and 3 days.
Solution: please refer to the attached solution.R.
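Since solution.R is not shown here, the following is a minimal sketch of what it might contain, using the markovchain package. The transition matrix values are an assumption (the classic Land of Oz probabilities); only the three states and the rule that there are never two nice days in a row are given above.

library(markovchain)

ozStates <- c("rainy", "nice", "snowy")
# Assumed transition matrix; the "nice" row gives probability 0 to "nice":
# never two nice days in a row
ozMatrix <- matrix(c(0.50, 0.25, 0.25,
                     0.50, 0.00, 0.50,
                     0.25, 0.25, 0.50),
                   nrow = 3, byrow = TRUE,
                   dimnames = list(ozStates, ozStates))
mcOz <- new("markovchain", states = ozStates, byrow = TRUE,
            transitionMatrix = ozMatrix, name = "Land of Oz weather")

w0 <- c(0, 1, 0)     # today is a nice day
w0 * mcOz            # forecast after 1 day
w0 * (mcOz ^ 2)      # forecast after 2 days
w0 * (mcOz ^ 3)      # forecast after 3 days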