Where do the "semantics" of a Bayesian network come from?
On Bayesian networks, Ghahramani (2001) says:

“A node is independent of its non-descendants given its parents.”
This point is fundamental enough that Ghahramani calls it the “semantics” of a Bayesian network. It is certainly useful, and it is simple enough to prove using d-separation. But his characterization suggests that the property should be even more primitive than something provable by d-separation.
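For concreteness, here is the statement as I read it, in standard notation (mine, not Ghahramani's): for each node $X_i$ with parent set $\mathrm{Pa}(X_i)$ and non-descendant set $\mathrm{ND}(X_i)$,

$$X_i \perp \big(\mathrm{ND}(X_i) \setminus \mathrm{Pa}(X_i)\big) \mid \mathrm{Pa}(X_i),$$

which, as I understand it, is equivalent to the usual factorization

$$p(x_1, \dots, x_n) = \prod_{i=1}^{n} p\big(x_i \mid \mathrm{pa}(x_i)\big).$$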
Overall, I feel that I am missing something. Is there a more primitive way to verify the statement than d-separation? Why does Ghahramani equate that fact specifically with the semantics of a Bayesian network, rather than equating the semantics with the full set of conditional independencies in the network (the ones given by d-separation)? And if the statement is a consequence of d-separation, why single it out rather than the (arguably equally useful) fact about Markov blankets?
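To make the kind of verification I am asking about concrete, here is the only check I currently know besides d-separation: brute-force computation on a toy example (a sketch with made-up conditional probability tables, using numpy). Take the chain $A \to B \to C$; the property says $C$ is independent of its non-descendant $A$ given its parent $B$.

```python
import numpy as np

# Toy chain A -> B -> C with made-up (hypothetical) conditional tables.
p_a = np.array([0.3, 0.7])                 # p(a)
p_b_given_a = np.array([[0.9, 0.1],        # p(b | a), rows indexed by a
                        [0.2, 0.8]])
p_c_given_b = np.array([[0.6, 0.4],        # p(c | b), rows indexed by b
                        [0.1, 0.9]])

# Joint from the Bayesian-network factorization:
# p(a, b, c) = p(a) p(b | a) p(c | b); axes are (a, b, c).
joint = (p_a[:, None, None]
         * p_b_given_a[:, :, None]
         * p_c_given_b[None, :, :])

# Local Markov property for C: p(c | a, b) should equal p(c | b).
p_ab = joint.sum(axis=2)                   # p(a, b)
p_c_given_ab = joint / p_ab[:, :, None]    # p(c | a, b)

p_bc = joint.sum(axis=0)                   # p(b, c)
p_c_given_b_from_joint = p_bc / p_bc.sum(axis=1, keepdims=True)

# The check passes: C is independent of A given B in this network.
assert np.allclose(p_c_given_ab, p_c_given_b_from_joint[None, :, :])
```

This confirms the independence numerically, but only for one hand-picked network, which is why I am asking whether there is a more primitive general argument.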
Reference: Ghahramani, Z. (2001). An introduction to hidden Markov models and Bayesian networks. In Hidden Markov models: Applications in computer vision (pp. 9–41).
Topic: graphical-model, bayesian-networks, machine-learning
Category: Data Science