Optimally sample from multiple distributions
I have two datasets, both with the structure shown in the table below. I want to downselect from dataset A so that the selected subset follows the distribution of values in dataset B. I want to consider Distance and Duration jointly, so that the distribution of both parameters in the subset drawn from A matches the distribution of these parameters in B as closely as possible.
Does anyone have suggestions for tools (preferably in Python) that would help here?
| ID | Distance | Duration |
|---|---|---|
| 1 | 5 | 17 |
| 2 | 9 | 20 |
| 3 | 2 | 100 |
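
To make the goal concrete, here is a rough sketch of one way this could be done with plain NumPy. It is only an illustration under stated assumptions, not a tested solution. It bins both datasets on a shared 2D grid over (Distance, Duration), weights each row of A by how common its bin is in B relative to A, and then draws rows of A without replacement using those weights. The function name `downselect` and the parameters `n_out`, `bins`, and `seed` are hypothetical, and the inputs are assumed to be arrays of shape `(n, 2)` with columns in the order Distance, Duration.

```python
import numpy as np

def downselect(a, b, n_out, bins=10, seed=0):
    """Return indices of n_out rows of `a` whose joint (Distance, Duration)
    histogram approximates that of `b`.

    Hypothetical helper: `a` and `b` are assumed to be float arrays of
    shape (n, 2) with columns [Distance, Duration].
    """
    rng = np.random.default_rng(seed)

    # Shared bin edges so both datasets are binned on the same grid.
    both = np.vstack([a, b])
    edges = [np.linspace(both[:, j].min(), both[:, j].max(), bins + 1)
             for j in range(2)]

    # Joint 2D counts for each dataset.
    hist_a, _, _ = np.histogram2d(a[:, 0], a[:, 1], bins=edges)
    hist_b, _, _ = np.histogram2d(b[:, 0], b[:, 1], bins=edges)

    # Bin index of every row of A (clip so the maximum lands in the last bin).
    ix = np.clip(np.digitize(a[:, 0], edges[0]) - 1, 0, bins - 1)
    iy = np.clip(np.digitize(a[:, 1], edges[1]) - 1, 0, bins - 1)

    # Weight per row: count of B in that bin divided by count of A in that bin.
    # hist_a is at least 1 here because the row itself falls in the bin.
    w = hist_b[ix, iy] / hist_a[ix, iy]
    w = w / w.sum()

    # Weighted draw without replacement; rows in bins that B never visits
    # get weight 0 and are never selected.
    return rng.choice(len(a), size=n_out, replace=False, p=w)
```

With pandas, the inputs could be built with something like `df_a[["Distance", "Duration"]].to_numpy(dtype=float)`. Two caveats with this sketch: the draw fails if fewer than `n_out` rows of A have non-zero weight (or if A and B do not overlap at all), and the match is only as fine as the bin grid, so `bins` would need tuning for real data.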