Testing independence using kernel density estimation

I am working on a problem where I have a dataset with four attributes $(X, Y, T, K)$. I'd like to test whether $P(X, Y, T)P(K) = P(X, Y, T, K)$, that is, whether $(X, Y, T)$ is independent of $K$. I have two questions:

  1. Is it possible to use kernel density estimation to fit $P(X, Y, T)$, $P(X, Y, T, K)$, and $P(K)$ separately and then test for independence? If I did so, would the result be accurate? (My dataset is very large, with tens of millions of records, so it probably won't underfit.)
  2. If not, what can I do to test it?

To put this in a hypothesis-testing setting: $$H_0: P(X, Y, T)P(K) = P(X, Y, T, K)$$ $$H_1: P(X, Y, T)P(K) \neq P(X, Y, T, K)$$ I don't know how to test this.
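The closest thing I can come up with is a permutation test on a KDE-based statistic, sketched below. This is only a rough illustration and not something I know to be valid or accurate: it assumes the attributes are continuous and roughly standardized, uses a Gaussian product kernel with a single fixed bandwidth `h = 0.5` (an arbitrary choice), and runs on small synthetic data. The function names `kernel_matrix`, `dependence_stat`, and `permutation_pvalue` are made up for this example. The statistic is the average log ratio of the joint KDE to the product of the marginal KDEs at the sample points, and shuffling $K$ gives a reference distribution under $H_0$.

```python
import math
import random


def kernel_matrix(rows, h):
    """Unnormalised Gaussian product-kernel weights between all pairs of rows."""
    return [
        [math.exp(-sum((a - b) ** 2 for a, b in zip(r, s)) / (2.0 * h * h)) for s in rows]
        for r in rows
    ]


def dependence_stat(W, V, perm):
    """Mean log ratio of joint KDE to product of marginal KDEs at the sample points.

    W and V are kernel matrices for (X, Y, T) and K; perm re-pairs the K values
    with the (X, Y, T) rows. The Gaussian normalising constants of the joint
    kernel and of the two marginal kernels cancel in the ratio.
    """
    n = len(W)
    total = 0.0
    for i in range(n):
        pi = perm[i]
        joint = sum(W[i][j] * V[pi][perm[j]] for j in range(n)) / n
        marg_xyt = sum(W[i]) / n
        marg_k = sum(V[pi]) / n
        total += math.log(joint / (marg_xyt * marg_k))
    return total / n


def permutation_pvalue(xyt, k, h=0.5, n_perm=49, seed=0):
    """Return (observed statistic, permutation p-value) for H0: (X, Y, T) independent of K."""
    rng = random.Random(seed)
    W = kernel_matrix(xyt, h)
    V = kernel_matrix([(v,) for v in k], h)
    identity = list(range(len(k)))
    observed = dependence_stat(W, V, identity)
    exceed = 0
    for _ in range(n_perm):
        perm = identity[:]
        rng.shuffle(perm)  # shuffling K breaks any dependence on (X, Y, T)
        if dependence_stat(W, V, perm) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


# Synthetic data (purely illustrative): one K that depends on X, one that does not.
data_rng = random.Random(1)
n = 120
xyt = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(n)]
k_dep = [row[0] + 0.1 * data_rng.gauss(0, 1) for row in xyt]
k_ind = [data_rng.gauss(0, 1) for _ in range(n)]

stat_dep, p_dep = permutation_pvalue(xyt, k_dep)
stat_ind, p_ind = permutation_pvalue(xyt, k_ind)
print("dependent K:   stat=%.3f  p=%.3f" % (stat_dep, p_dep))
print("independent K: stat=%.3f  p=%.3f" % (stat_ind, p_ind))
```

The pairwise kernel matrices cost $O(n^2)$ time and memory, so on tens of millions of records this would only be usable on a subsample, and I am unsure how sensitive the result is to the bandwidth.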

Any help or insight appreciated!!

