Feature Encoding for team-based sports data

Question

Martin Pichler

May 17, 2019, 13:45

I am currently playing around with Keras and trying to use it with various datasets. Now I have a small dataset of football game results:

date, home_team, away_team, goals_home_team, goals_away_team

Predicting the goals is probably too hard, so I combined them into a single feature, outcome (win, draw, loss):

date, home_team, away_team, outcome
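A minimal sketch of that transformation in plain Python. The sample rows are made up for illustration, and the sketch assumes the outcome is expressed from the home team's point of view, which the column list above does not state explicitly.

```python
# Hypothetical sample rows in the column order:
# date, home_team, away_team, goals_home_team, goals_away_team
games = [
    ("2019-05-04", "Team A", "Team B", 2, 1),
    ("2019-05-05", "Team C", "Team A", 0, 0),
    ("2019-05-11", "Team B", "Team C", 1, 3),
]


def outcome(goals_home_team, goals_away_team):
    """Return the result as seen by the home team (an assumption)."""
    if goals_home_team > goals_away_team:
        return "win"
    if goals_home_team < goals_away_team:
        return "loss"
    return "draw"


# Replace the two goal columns with the single outcome column.
labelled = [
    (date, home_team, away_team, outcome(goals_home, goals_away))
    for date, home_team, away_team, goals_home, goals_away in games
]
```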

Using random forests or decision trees, I could simply leave the teams as they are, but for a neural network I need some encoding. Here is my problem: which encoding should be used for categorical features that play different roles in the dataset (here, home team versus away team)? Put another way, how would the network know which team is, e.g., the winning team? If I use one-hot encoding, how would the network learn that the outcome is related to home_team? I also thought about using a label encoder, but that would impose an order on the teams.
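To make the two options concrete, here is a small sketch in plain Python. The team names and the helper functions `label_encode` and `one_hot_encode` are hypothetical, and giving home_team and away_team their own one-hot blocks is only one possible layout, not an established best practice.

```python
# Hypothetical team list; sorted so the indices are reproducible.
teams = sorted({"Team A", "Team B", "Team C"})
index = {team: i for i, team in enumerate(teams)}


def label_encode(home_team, away_team):
    """One integer per team. The integers imply an order that has no meaning."""
    return [index[home_team], index[away_team]]


def one_hot_encode(home_team, away_team):
    """Two one-hot blocks: positions 0..n-1 for home, n..2n-1 for away."""
    n = len(teams)
    vector = [0] * (2 * n)
    vector[index[home_team]] = 1      # home block
    vector[n + index[away_team]] = 1  # away block
    return vector


label_row = label_encode("Team C", "Team A")
one_hot_row = one_hot_encode("Team A", "Team C")
```

In this layout the role of a team is carried by the position of the 1 in the vector, so the same team produces different inputs depending on whether it plays at home or away. The label-encoded row is shown only to illustrate the ordering concern.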

Is there a "best practice" on how to encode this sort of data?

Topic game sports

Category Data Science
