Sampling to retain as much multivariate variance as possible
Has anyone considered a sampling technique that aims to keep as much of the variance in the data as possible (e.g. as many unique values as possible, or continuous variables spread as widely as possible)?
The benefit would be that code developed against such a sample gets exercised on the edge cases in the data.
A representative sample can always be drawn later.
So I am wondering whether people have tried to sample for maximum variance before, and whether there is a clever way to draw a sample with the highest possible variance (an approximation is of course fine).
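To make the idea concrete, one approximation I could imagine is greedy farthest-point selection: start from a random row, then repeatedly add the row that is farthest from everything chosen so far. The sketch below is only an illustration in Python with NumPy. The function name `farthest_point_sample` and its parameters are my own invention, it assumes purely numeric columns, and it maximizes spread (the max-min distance) as a stand-in for variance rather than maximizing variance exactly.

```python
import numpy as np

def farthest_point_sample(X, k, seed=0):
    """Greedily pick k distinct row indices of X that are far apart.

    Heuristic only: it spreads the sample out (max-min distance),
    which tends to pull in extreme rows, but it does not provably
    maximize the sample variance.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of rows")

    # Standardize columns so no single variable dominates the distances.
    std = X.std(axis=0)
    std[std == 0] = 1.0  # constant columns: avoid division by zero
    Z = (X - X.mean(axis=0)) / std

    rng = np.random.default_rng(seed)
    first = int(rng.integers(n))
    chosen = [first]

    # Distance from every row to its nearest already-chosen row.
    min_dist = np.linalg.norm(Z - Z[first], axis=1)
    min_dist[first] = -np.inf  # never pick the same row twice

    for _ in range(k - 1):
        nxt = int(np.argmax(min_dist))
        chosen.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(Z - Z[nxt], axis=1))
        min_dist[nxt] = -np.inf

    return np.array(chosen)
```

Calling `farthest_point_sample(data, 100)` would return 100 row indices into `data`. Each step costs one pass over the rows, so the whole thing is O(n * k) distance computations, which seems acceptable for a development sample. Categorical columns would need some encoding or a different distance first.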
Topic multivariate-distribution variance sampling
Category Data Science