If I have got different routes - a series of (lat, lng) points, how to get the similarity of different routes?

It is a real world use case. For example, a route from place A to place B can be different series of lat, lng points - the different trips though they are exactly the same sequence from Street x and then Road y then High way z.

The differences are the locations are reported in different time (for example each trip reports the location 1 minute) and the Vehicles appear in the different lane of the same street.

So, do you have some ideas about how to compute the similarity of two different trips belong to the same route (kinds of mapping different trips to the same route ).

Topic usecase similarity

Category Data Science


I would go about this problem using the following approach:

  1. Create a poly-line for the baseline trip.
  2. Create a buffer zone (closed polygon) around this poly-line using a small radius, say 20 meters.
  3. Using the points in the second trip, calculate the fraction of points that lie outside the buffer zone.

A fraction of zero means that all points from the second trip lie inside the first trip's buffer zone, while a fraction of one means that all lie outside.

You can use this fraction as a measure of dissimilarity, where 1 means that both trips are completely dissimilar and 0 means completely similar (within the given buffer radius). For completeness, you might want to reverse the poly-line roles and calculate the "reverse" similarity. Your final similarity score could then be the product of both.

I usually implement my algorithms in C# using .NET and for these purposes I use Microsoft's System.Spatial NuGet package. Here you can find methods like STBuffer and STContains that will greatly help to implement this.


If I correctly understood your problem: you have a set of routes between the same two start and destination locations, and each route is described as a set of (lat, lon, time) tuples.

In order to compute the distance between two routes, one possibility would be to apply a variant of the edit distance between strings. This algorithm computes the distance between two strings as the amount of string operations (edit, remove, add letters) required to transform the first string into the second one.

You just have to come up with a set of "route operations", and the cost of each of these operations (add point routes, remove point routes, replace one point route by another one, and so on).

About

Geeks Mental is a community that publishes articles and tutorials about Web, Android, Data Science, new techniques and Linux security.