4

I am new to Markov chains and HMMs, and I am looking for help developing a program (in Python) that predicts the next state based on the 20 previous states (let's say 20 states over the last 20 months). I have a sequential dataset with 50 customers, i.e. each row contains the sequence of 20 states for one of the 50 customers (the dataset has 50 rows and 20 columns, excluding the headers). I am trying to determine the next state using Markov chains, and all the literature on the web is focused on examples with text strings. I am looking for something specific to the kind of example I have. Can somebody please help me come up with the initial probability matrix and then use the 20 states to predict the next state?

mlgal55
  • 43
  • 1
  • 4

3 Answers

3

If you know what the state history is, you don't need a 'hidden' Markov model, you just need a Markov model (or some other mechanism). The 'hidden' part implies a distinction between some sequence of unobservable states, and some observations that are related to them. In your case, you say you have observed the past states for each customer, so you don't necessarily need to infer anything 'hidden'.

The simplest way to proceed in your case would be to calculate a transition matrix, i.e. the probability of each state given the previous state. That's a very simple model, but it might do what you want. To do this, just look at every adjacent pair of states and count them to get p(s2 | s1) = p(s1, s2) / p(s1). This is the bigram model you've probably read about, with each state playing the role of a word.
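
As a minimal sketch of that calculation, assuming the 50 x 20 dataset sits in a CSV file (the file name `customer_states.csv` is just a placeholder) and the states are categorical labels:

```python
import numpy as np
import pandas as pd

# Illustrative: 50 rows (customers) x 20 columns (monthly states).
df = pd.read_csv("customer_states.csv")   # placeholder file name
sequences = df.values                     # shape (50, 20)

states = np.unique(sequences)             # the distinct state labels
index = {s: i for i, s in enumerate(states)}
n = len(states)

# Empirical initial-state distribution (frequency of each state in month 1).
initial = np.bincount([index[s] for s in sequences[:, 0]], minlength=n) / len(sequences)

# Count transitions s1 -> s2 over every adjacent pair in every customer's sequence.
counts = np.zeros((n, n))
for row in sequences:
    for s1, s2 in zip(row[:-1], row[1:]):
        counts[index[s1], index[s2]] += 1

# Row-normalise to get P(s2 | s1); guard against states that never occur as s1.
transition = counts / counts.sum(axis=1, keepdims=True).clip(min=1)

def predict_next(last_state):
    """Most likely next state given a customer's most recent state."""
    return states[transition[index[last_state]].argmax()]

# Example: predict the 21st state for the first customer from their 20th state.
print(predict_next(sequences[0, -1]))
```

Here `initial` is the "initial probability matrix" the question asks about (the distribution over states in the first month), and `transition` is the matrix of p(s2 | s1).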

You could also build a more complex model, like a trigram model (two previous states of context) or even an RNN. Honestly, since you have a fixed amount of history, you could just throw your data into a scikit-learn model or xgboost, where each customer's history is the vector of predictors and the next state is the outcome. It won't explicitly model the sequential dependencies, but since you are essentially indexing the past states by time, it may work pretty well.
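
As a rough sketch of that second option, reusing the `sequences` array from the snippet above (the random forest and one-hot encoding are just illustrative choices, not a recommendation of a specific model):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import OneHotEncoder

# Supervised pairs: the first 19 states are the predictors, the 20th is the label.
X_raw, y = sequences[:, :-1], sequences[:, -1]

encoder = OneHotEncoder(handle_unknown="ignore")
X = encoder.fit_transform(X_raw)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X, y)

# To forecast month 21, slide the window forward and feed months 2-20 as the history.
X_next = encoder.transform(sequences[:, 1:])
print(clf.predict(X_next))
```

With only 50 customers this will be noisy, but the same pattern scales if you get more rows.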

If you need more clarification about part of this, just ask.

tom
  • 2,288
  • 13
  • 13
3

Because there is very little data, an HMM will probably overfit (depending on the number of states and symbols). I would go with a simple Markov chain, as it has fewer parameters and you don't need to tune things like the number of hidden states. If you do go with an HMM, I would recommend a package called pomegranate.

I would also recommend trying a multinomial model, which can be viewed as a zero-memory Markov model; maybe your data doesn't have past dependencies.
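
A quick sketch of that zero-memory baseline: the prediction is just the most frequent state, ignoring order entirely (the file name below is a placeholder for the 50 x 20 dataset described in the question):

```python
from collections import Counter
import pandas as pd

# Placeholder file name, same 50 x 20 layout as in the question.
sequences = pd.read_csv("customer_states.csv").values

# Global baseline: the state that occurs most often across all customers and months.
overall = Counter(sequences.ravel())
print(overall.most_common(1)[0][0])

# Per-customer variant: each customer's own most frequent state.
per_customer = [Counter(row).most_common(1)[0][0] for row in sequences]
print(per_customer[:5])
```

If this baseline predicts about as well as the Markov chain, the past states are probably not carrying much signal.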

Itaysason
  • 149
  • 1
  • 1
  • 6
1

1) You can use an HMM, SSM, or UCM if you assume the transitions happen from some hidden state. With only 20 data points per customer, though, I wonder how well the model fitting will go.

2) A Markov chain does not make the hidden-state assumption, but it will still give you a robust probabilistic model to forecast future states.

3) This article might help: https://www.datacamp.com/community/tutorials/markov-chains-python-tutorial

For the basics of HMMs and hidden states, see:

https://machinelearningstories.blogspot.com/2017/02/hidden-markov-model-session-1.html

Arpit Sisodia
  • 425
  • 2
  • 10