How to introduce an intentional bias towards recent examples in text classification?

I have trained an XGBClassifier to assign text issues to the right assignee (a simple 50-way classification). The source I fetch the data from also provides a datetime object giving the timestamp at which each issue was created.

Logically, a person who worked on an issue recently (say 2 weeks ago) should be a better suggestion than another person who worked on a similar issue 2 years ago.

That is, if there are two similar examples in the training set, one recent and one old, how can I introduce an intentional recency bias so that the model suggests the label from the recent example?



Following your hypothesis, you could simply add a feature which represents how recently the assignee worked on another issue, for example the number of days since their last issue. Normally, once this feature is in the training data, the model will take it into account, and if the hypothesis holds most of the time, it should learn that a low value makes that assignee more likely.
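The recency feature described above can be sketched with pandas. This is only an illustration under some assumptions: the column names `created_at` and `assignee` and the helper `add_recency_features` are hypothetical, and because the assignee is the label being predicted, the sketch builds one "days since last issue" column per candidate assignee rather than a single column. Each value is computed only from earlier issues, so the label of the current issue does not leak into its own features.

```python
import pandas as pd


def add_recency_features(issues: pd.DataFrame) -> pd.DataFrame:
    """Add one 'days_since_<assignee>' column per assignee.

    Each value is the number of days between the issue's creation time and
    the most recent *earlier* issue handled by that assignee (NaN if none).
    """
    issues = issues.sort_values("created_at").reset_index(drop=True)
    assignees = sorted(issues["assignee"].unique())

    last_seen = {}  # assignee -> timestamp of their latest issue so far
    rows = []
    for created_at, assignee in zip(issues["created_at"], issues["assignee"]):
        rows.append({
            f"days_since_{a}": (created_at - last_seen[a]).days
            if a in last_seen else float("nan")
            for a in assignees
        })
        # Update only after the features are built, to avoid label leakage.
        last_seen[assignee] = created_at

    return pd.concat([issues, pd.DataFrame(rows)], axis=1)


# Toy data (made up for illustration).
issues = pd.DataFrame({
    "created_at": pd.to_datetime(["2023-01-01", "2023-01-11", "2023-01-15"]),
    "assignee": ["alice", "bob", "alice"],
})
features = add_recency_features(issues)
print(features)
```

The new columns can be concatenated with the text features before training. Missing values (an assignee with no earlier issue) are left as NaN, which XGBoost treats as missing by default.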

Generally, 50-way classification is hard: a uniform random baseline would get only 2% accuracy. Note that you could consider different options for the design, for example ranking the developers by their compatibility with the task. In this case you could also favor developers who worked on an issue recently, and take into account the similarity of the issue with their past ones.
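One possible way to sketch the ranking idea is to combine a per-developer similarity score with a recency weight. Everything here is an assumption made for illustration: the function `rank_assignees`, the exponential decay with a 30-day half-life, and the choice to multiply the two terms are not prescribed by the approach above, and the similarity scores are assumed to come from some other step (for example, comparing the new issue with each developer's past issues).

```python
def rank_assignees(similarity, days_since_last, half_life_days=30.0):
    """Rank developers by similarity weighted by an exponential recency decay.

    similarity:      dict of developer -> similarity score in [0, 1]
    days_since_last: dict of developer -> days since their last issue
    half_life_days:  assumed number of days after which the weight halves
    """
    scores = {}
    for name, sim in similarity.items():
        days = days_since_last.get(name)
        # Developers with no known past issue get zero recency weight.
        recency = 0.0 if days is None else 0.5 ** (days / half_life_days)
        scores[name] = sim * recency
    return sorted(scores, key=scores.get, reverse=True)


# Made-up scores: alice is the closest match but last worked 2 years ago.
similarity = {"alice": 0.9, "bob": 0.8, "carol": 0.5}
days_since_last = {"alice": 730, "bob": 14, "carol": 3}

ranking = rank_assignees(similarity, days_since_last)
print(ranking)
```

With these made-up numbers, bob ranks first because his issue is both similar and recent, while alice drops to the bottom despite the highest similarity. The half-life controls how strongly recency dominates and would need tuning on real data.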
