Very low probability in naive Bayes classifier 1
I have some training data (TRAIN) and some test data (TEST). Each row of each table contains an observed class (X) and some columns of binary (Y). I'm using a Python script that is intended to predict the probability (Pr) of X given Y in the test data based on the training data. It uses a Bernoulli naive Bayes classifier. Here is my script:
https://stackoverflow.com/questions/55187516/look-up-bernoullinb-probability-in-dataframe
It works on the dummy data that is included with the script.
On the real data, I know from experience which class some of the Y columns are indicative of. My script however is giving probability predictions like "1" where I don't think that the class is correct and "6e-77" on correct classes.
Any advice on what I can try please?
Edit
There are two problems. The very low probability is caused by the naive assumption that nothing is related to anything else. This is described here: https://scikit-learn.org/stable/auto_examples/calibration/plot_calibration_curve.html#sphx-glr-auto-examples-calibration-plot-calibration-curve-py
The incorrect answers are caused by my code getting confused about which class is which, as described on my Stack Overflow post.
Topic prediction probability naive-bayes-classifier machine-learning
Category Data Science