House price inflation modelling

I have a data set of house prices and their corresponding features (number of rooms, floor area in square meters, etc.). An additional feature is the date on which the house was sold. The aim is to create a model that can estimate the price of a house as if it were sold today. For example, given a house with a specific set of features (5 rooms, 100 square meters) and today's date (28-1-2020), what would it sell for? Time is an important component, because prices increase (inflate) over time. I am struggling to find a way to incorporate the sold date as a feature in the gradient boosting model.

I think there are a number of approaches:

  1. Convert the date into an integer, and include it directly in the model as a feature.
  2. Create a separate model for the development of house prices over time. Let's think of this as some kind of AR(1) model. I could then adjust all observations for inflation, so that each one gets an inflation-adjusted price for today. The gradient boosting model would then be trained on the feature set with these inflation-adjusted prices as the target.

What are your thoughts on these two options? Are there any alternative methods?

Topic natural-gradient-boosting machine-learning

Category Data Science


The two most common ways to model inflation are indirectly and directly.

Inflation can be modeled by adding time as a feature to the model. A useful way to encode time is as a relative month: the first month in the dataset is 1, the second month is 2, and so on. The model can then capture how price changes as the month index increases.
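The relative-month encoding can be sketched as follows. This is only an illustration using the standard library; the helper name `relative_month` and the sale dates are made up for the example and are not from the question's data.

```python
from datetime import date


def relative_month(sold: date, first: date) -> int:
    """Return 1 for the month of `first`, 2 for the month after, and so on."""
    return (sold.year - first.year) * 12 + (sold.month - first.month) + 1


# Hypothetical sale dates, for illustration only.
sold_dates = [
    date(2018, 11, 3),
    date(2018, 12, 20),
    date(2019, 1, 15),
    date(2020, 1, 28),
]

# The earliest sale defines month 1.
first = min(sold_dates)

# One integer per sale, to be used as an extra feature column.
month_index = [relative_month(d, first) for d in sold_dates]
print(month_index)
```

To estimate a price "as if sold today", the same function would be applied to today's date at prediction time. One caveat worth keeping in mind: tree-based gradient boosting predicts piecewise-constant values, so it generally cannot extrapolate a trend beyond the largest month index seen in training.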

Inflation could also be modeled directly. The model predicts the price for a specific point in time; that estimated price is then adjusted to today's dollar value by multiplying it by a looked-up inflation factor.
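A minimal sketch of this adjustment, assuming a price index keyed by (year, month) is available from some external source. The index values, the `price_index` table, and the function name `adjust_to_today` are all hypothetical and chosen only to show the arithmetic.

```python
# Hypothetical house price index by (year, month); the values are invented.
price_index = {
    (2018, 1): 100.0,
    (2019, 1): 104.0,
    (2020, 1): 110.0,
}


def adjust_to_today(price, sold_period, today_period, index):
    """Scale a price from `sold_period` to `today_period` using an index ratio."""
    factor = index[today_period] / index[sold_period]
    return price * factor


# A price estimated for January 2018, expressed in January 2020 terms.
adjusted = adjust_to_today(200_000.0, (2018, 1), (2020, 1), price_index)
print(round(adjusted, 2))
```

The same ratio could equally be applied to the observed sale prices before training, which corresponds to the second option in the question.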
