Confused about the Naive Bayes classifier
In the last part of Andrew Ng's lecture on Gaussian Discriminant Analysis and the Naive Bayes classifier, I am confused about how he arrived at $2^n - 1$ features for the Naive Bayes classifier.
First off, what does he mean by "features" in the context he was describing? I initially thought that the features were the components of our random vector $x$. I know that the total number of possible values of $x$ is $2^n$, but I do not understand how he got $2^n - 1$, and I cannot picture it in my head. I understand in general that Naive Bayes is a simpler way of calculating the conditional probability, but I would like to understand this part a bit better.
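To make my counting concrete, here is a minimal sketch of how I arrive at $2^n$. It assumes each of the $n$ components of $x$ is binary (0 or 1), and `n = 3` is just an illustrative choice of mine, not a value from the lecture.

```python
from itertools import product

n = 3  # illustrative choice; any small positive integer works

# Every possible binary vector x = (x_1, ..., x_n), each x_i in {0, 1}.
outcomes = list(product([0, 1], repeat=n))

print(len(outcomes))  # number of possible values of x
print(2 ** n)         # the same count, written as 2^n
print(2 ** n - 1)     # the number quoted in the lecture
```

The first two numbers agree, so the $2^n$ part makes sense to me. The number from the lecture is one less than this count, and that difference of one is the part I cannot account for.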
For reference: https://www.youtube.com/watch?v=nt63k3bfXS0 (go to 1:09:00)
Topic mathematics bayesian gaussian naive-bayes-classifier machine-learning
Category Data Science