Re-sampling a Histogram's Bins
I would like to be able to resample a histogram's bins without having access to the raw data. To be clear, by resample I mean changing the number of bins while still providing a good estimate of the original probabilities of those bins.
I can think of many ways to do this, but I am having trouble figuring out which method best preserves the probabilities in the resulting histogram. The easy case is when the input histogram X has x bins and the desired result histogram Y has y bins, where x = y: this is a simple one-to-one copy of the original bins. The problem arises when y is smaller than x or larger than x.
For example, if x = 10 bins and y = 20 bins, it seems you could simply duplicate each of X's bins, giving Y = { x1, x1, x2, x2, x3, x3, ..., x10, x10 }. This seems naive, though, because the second copy of each bin should arguably also be influenced by the next bin's value, for example y2 = (x1 + x2)/2.
If x = 20 bins and y = 7 bins, it does not seem fair to simply sample a value by linear interpolation between the two nearest data points, as there might be 3 or 4 points on either side of the sample that should contribute to the probability of the resampled bin.
I would also like to handle the case where the histogram is bounded at the ends. For example, when measuring water temperature under standard conditions, temperatures below freezing or above boiling are not likely. I would like to be able to treat the probability beyond one or both extreme bins as 0.
Is there a standard algorithm for the above re-sampling / re-sizing that can be coded in C++/C#, or pseudocode that I can convert to code?
Topic histogram probability
Category Data Science