Ordering scrambled 1D data sets by continuity
This is a cute little clustering problem that was probably solved a million times over, but I couldn't find a good reference for it.
I have 20 1D datasets with 400 entries each. In the picture they are denoted by different colors.
As you can see, they are also pretty continuous. However, for each index i the datasets have been re-ordered by magnitude, i.e. instead of nice continuous lines, the color now jumps at every intersection of each two datasets.
Is there a way to bring the the datasets to their original order? I.e. cluster the data into 20 continuous lines? This could be easily done by eye.
Thank you very much!
Ofri
P.S. Here's what I tried so far. For each index i I assumed the datasets are already ordered up to i-1. I now prepare 20 bins by extrapolating the ordered datasets from the first i-1 indices to the i index. Now I have 20 values to put in 20 bins, with exactly one value in each bin. I can try 20! combinations and find the one with the minimal error but there must be a more clever/efficient way.
Topic interpolation dataset clustering
Category Data Science