Problem with binning
I am trying to convert continuous data points to categorical ones by binning. I know two techniques: (i) equal-width bins and (ii) bins with an equal number of elements. My questions are:
- Which type of binning is appropriate for which kind of problem?
- I use pandas for my data analysis task. It has the `pd.cut` method for arbitrary binning, which I use for equal-width bins, and the `pd.qcut` method for bins with an equal number of elements. The second function always produces very complicated bin boundaries (like [(-28.004, 795.8976], (795.8976, 900.342]]). Is there any way to control the bin boundaries so that they look more meaningful to non-technical people?
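To make the setup concrete, here is a minimal sketch of the two calls I mean, plus one possible way to get rounder boundaries. The sample data, the choice of rounding to the nearest 10, and the label format are all assumptions made up for illustration, not part of my real data. Note that rounding the edges means the bins are only approximately equal in size.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the values are made up for illustration.
rng = np.random.default_rng(0)
values = pd.Series(rng.uniform(0, 1000, size=200))

# Equal-width bins: pd.cut splits the value range into 4 intervals of equal width.
equal_width = pd.cut(values, bins=4)

# Equal-frequency bins: pd.qcut puts roughly the same number of points in each bin.
# retbins=True additionally returns the computed bin edges.
equal_freq, edges = pd.qcut(values, q=4, retbins=True)

# One possible way to get friendlier boundaries (an assumption, not the only way):
# round the quantile edges to the nearest 10 and reuse them as explicit bins.
rounded_edges = np.round(edges, -1)
rounded_edges[0] = np.floor(values.min() / 10) * 10   # keep the minimum inside the first bin
rounded_edges[-1] = np.ceil(values.max() / 10) * 10   # keep the maximum inside the last bin
rounded_edges = np.unique(rounded_edges)               # drop edges that collapsed together

# Plain-text labels instead of interval notation.
labels = [f"{lo:g} to {hi:g}" for lo, hi in zip(rounded_edges[:-1], rounded_edges[1:])]
readable = pd.cut(values, bins=rounded_edges, labels=labels, include_lowest=True)

print(equal_freq.value_counts().sort_index())
print(readable.value_counts().sort_index())
```

The counts printed for `readable` show how far the rounded bins drift from exactly equal sizes, which is the trade-off for nicer boundaries.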
Thanks in advance.
Topic feature-engineering numerical data categorical-data
Category Data Science