Estimating location in a model

Question

principe

May 3, 2022 10:05

I have a large dataset with 10 columns and about 100,000 rows. Every 5 rows represent one person being tracked, along with data from that tracking such as time, velocity, etc. The last two columns are that person's longitude and latitude.

To test the model, the test set has the longitude and latitude missing in the fifth row for each person. What's the best way to approach this problem?

For example, the test set looks like:

id   time    feature2  feature3  long    lat
1      x          x        x     number  number
1      x          x        x     number  number
1      x          x        x     number  number
1      x          x        x     number  number
1      x          x        x     
2      x          x        x     number  number
2      x          x        x     number  number
2      x          x        x     number  number
2      x          x        x     number  number
2      x          x        x

etc

Topic machine-learning-model predictive-modeling algorithms machine-learning

Category Data Science

Brian Spiering · Accepted Answer · November 22, 2020 15:52

One option would be to cluster the longitude and latitude values. Exact point estimates of longitude and latitude would be wrong much of the time. Clustering lowers the precision of the target, which increases the chance that the model's prediction is approximately correct.

Longitude and latitude can be clustered using a spatially aware indexing system such as H3. Spatially aware indexing allows for bins of different sizes.
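
As a rough illustration of the binning idea, here is a minimal sketch in plain Python. It uses a simple square grid in degrees as a stand-in for H3's hexagonal cells, so it is not the H3 API. The function names, the cell sizes, and the sample coordinates are all made up for the example.

```python
import math

def grid_cell(lat, lng, cell_deg):
    """Return the (row, col) index of the square cell containing a point.

    cell_deg is the cell width in degrees (a hypothetical parameter that
    plays the role of a resolution setting).
    """
    return (math.floor(lat / cell_deg), math.floor(lng / cell_deg))

def cell_center(cell, cell_deg):
    """Return the (lat, lng) at the center of a cell, usable as a coarse estimate."""
    row, col = cell
    return ((row + 0.5) * cell_deg, (col + 0.5) * cell_deg)

# Hypothetical track: the four known (lat, long) rows for one person.
track = [
    (48.8566, 2.3522),
    (48.8573, 2.3533),
    (48.8578, 2.3541),
    (48.8584, 2.3556),
]

# Coarse bins (0.01 degrees) versus fine bins (0.001 degrees).
coarse = [grid_cell(lat, lng, 0.01) for lat, lng in track]
fine = [grid_cell(lat, lng, 0.001) for lat, lng in track]

print("coarse cells:", coarse)
print("fine cells:  ", fine)
```

With the coarse grid all four points land in the same cell, so predicting the cell of the fifth row becomes a classification problem with a good chance of being right. With the fine grid the points spread over several cells, which is more precise but harder to predict. The bin size is the knob that trades precision against the chance of being approximately correct.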
