Markov Chain vs Bayes Net

I am learning about Markov chains and Bayesian networks. However, at this point I am a bit confused about which types of problems are modelled by the two different models presented to us. From what I understand (mostly from the examples I have read), Markov chains are used to represent the change in a single type of variable over time. So for example, consider a random variable X representing the weather, with X = {sun, rain}. For a Markov chain, at time t = 0 we are given P(X) and a transition model $P(X_t \mid X_{t-1})$. With this knowledge we could calculate $P(X_\infty)$, i.e. answer questions like: given the initial distribution of a random variable X and a transition model, what is P(X = x) at time t? Such questions can be answered with the mini-forward algorithm.
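To make this concrete, here is a minimal sketch of the mini-forward update for the weather example; the initial distribution and transition probabilities below are made up for illustration.

```python
# Mini-forward algorithm: propagate P(X_t) one step at a time.
# All probabilities here are illustrative, not from real data.
P0 = {"sun": 0.5, "rain": 0.5}  # initial distribution P(X_0)
T = {                           # T[prev][cur] = P(X_t = cur | X_{t-1} = prev)
    "sun":  {"sun": 0.9, "rain": 0.1},
    "rain": {"sun": 0.3, "rain": 0.7},
}

def forward_step(p):
    """One mini-forward update: P(X_t = x) = sum over prev of P(X_{t-1} = prev) * T[prev][x]."""
    return {x: sum(p[prev] * T[prev][x] for prev in p) for x in p}

p = P0
for _ in range(100):  # iterating long enough approaches the stationary distribution
    p = forward_step(p)
print(p)  # converges to roughly {'sun': 0.75, 'rain': 0.25}
```

Repeating the update long enough answers the $P(X_\infty)$ question: the distribution settles to a stationary point regardless of the starting distribution (for this chain).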
Now for Bayesian networks, from what I understand, we model dependencies among different random variables. So here we essentially have some random variables that may have causal relationships with other variables. There are some nice properties of such networks that let us define the joint distribution over all variables easily.
Onto my question: often the topic of Markov chains is introduced before Bayesian networks. What is the relationship between the two? I can't seem to draw parallels between them; to me they look like quite different approaches to modelling quite different problems.
In what other contexts can Markov chains be used? Or are they always used to model a single variable varying over time steps? I hope to gain some clarity to distinguish between the two, and hopefully this will help me understand the topics better! Any suggestions/readings/links are much appreciated!

Topic bayesian-networks markov-process

Category Data Science


First of all, I'd disagree that Markov chains deal with a "single type of variable". If you look at the formal definition of a Markov chain, you'll see that the $X$ are random variables, and random variables are defined over arbitrary (well, measurable) sets of possible outcomes. So your $X$ need not take values only in the set $\{sun, rain\}$; it could take values in the Cartesian product of $\{sun, rain\}$, $\{windy, cloudy, calm\}$, and a temperature from the interval $[-60, 60]$.

About the relation between Markov chains and Bayes nets, I'd say that there is a common framework that lets you understand the relationship between them (and, in fact, many other probabilistic structures). In all cases we have a collection of random variables: for Markov chains these are $X_0, X_1, X_2, \dots$, and for Bayes nets let's call them $A, B, C, D, E, F$.

In both cases we are interested in the probability distribution over the whole collection of variables, called the joint probability distribution:

$$P(X_0,X_1,X_2\dots)\qquad\text{and} \qquad P(A,B,C,D,E,F)$$

The problem is that representing this distribution directly (say, on a computer) is infeasible for any reasonably complex problem: the number of possible combinations of variable values grows exponentially with the number of variables. So we come up with ways to factorize these joints into smaller, manageable factors.

For Markov chains, the factorization property is $$P(X_0,X_1,X_2,\dots) = P(X_0)\,P(X_1|X_0)\,P(X_2|X_1)\,P(X_3|X_2)\cdots$$
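This factorization means the joint probability of any short trajectory is just a product of one initial term and one transition term per step. A minimal sketch, reusing made-up weather probabilities:

```python
# Joint probability of a Markov chain trajectory via the chain factorization:
# P(x0, x1, x2, ...) = P(x0) * P(x1|x0) * P(x2|x1) * ...
# Probabilities are illustrative, not from real data.
P0 = {"sun": 0.5, "rain": 0.5}
T = {"sun":  {"sun": 0.9, "rain": 0.1},
     "rain": {"sun": 0.3, "rain": 0.7}}

def joint(seq):
    """Probability of observing the whole sequence of states."""
    p = P0[seq[0]]
    for prev, cur in zip(seq, seq[1:]):
        p *= T[prev][cur]
    return p

print(joint(["sun", "sun", "rain"]))  # 0.5 * 0.9 * 0.1 = 0.045
```

Note that a full table over a length-$n$ trajectory would need $2^n$ entries, while the factorized form only ever stores $P(X_0)$ and one transition table.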

For Bayes nets, the factorization could look like this (for a particular dependency graph that I just made up):

$$P(A,B,C,D,E,F) = P_a(A)P_b(B)P_c(C|A)P_d(D|A,B)P_e(E|C,D)P_f(F|E)$$
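To see how much this factorization saves, we can count free parameters for six binary variables. The graph is the made-up one above; the arithmetic below is just a sketch of the counting argument.

```python
# Free-parameter count for six binary variables:
# full joint table vs. the factorization P(A)P(B)P(C|A)P(D|A,B)P(E|C,D)P(F|E).
full = 2**6 - 1  # 63 free parameters (one probability per outcome, minus normalization)

# Each conditional table for a binary variable needs 2**(number of parents)
# free parameters (one per parent configuration).
num_parents = {"A": 0, "B": 0, "C": 1, "D": 2, "E": 2, "F": 1}
factored = sum(2**k for k in num_parents.values())

print(full, factored)  # 63 14
```

The gap widens dramatically as variables are added: the full joint doubles with each new binary variable, while the factorized form only grows by one (typically small) conditional table.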

In my opinion, considering the joint distribution and then seeing what factorization structure your framework imposes on it is a good place to start studying that framework.
