Currency Normalization for Salary Prediction
I have a dataset (350k data points) covering employees across different regions over the last 10 years. Each record contains the employee's skills, region, industry, current role, and salary in the respective local currency.
After doing some analysis, I have found 60% of the salaries are in SGD, 30% in INR, and the rest are divided across 15 other currencies.
Is it recommended that I have a model for each currency or is there a way I can convert all the currencies to a universal value so I can use all my data points to train?
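To make the second option concrete, here is a minimal sketch of the kind of conversion I have in mind. It assumes a lookup table of yearly average exchange rates into one reference currency. USD as the reference, the `RATES_TO_USD` table, the `to_reference` helper, and every rate value below are hypothetical placeholders, not real data from my dataset.

```python
# Hypothetical yearly average rates: reference-currency (USD) units per 1 unit
# of local currency. The numbers are placeholders, not real exchange rates.
RATES_TO_USD = {
    ("SGD", 2015): 0.70,
    ("SGD", 2020): 0.75,
    ("INR", 2015): 0.016,
    ("INR", 2020): 0.0125,
}


def to_reference(amount, currency, year, rates=RATES_TO_USD):
    """Convert a salary to the reference currency using that year's rate."""
    key = (currency, year)
    if key not in rates:
        # Fail loudly rather than silently mixing unconverted salaries in.
        raise ValueError(f"no exchange rate for {currency} in {year}")
    return amount * rates[key]


# Toy records standing in for rows of the dataset.
records = [
    {"salary": 60000, "currency": "SGD", "year": 2020},
    {"salary": 800000, "currency": "INR", "year": 2020},
]

# Add a converted salary column that a single model could use as its target.
for r in records:
    r["salary_usd"] = to_reference(r["salary"], r["currency"], r["year"])
```

Using the rate for the year of each record, rather than one fixed rate, is my guess at how to handle the 10-year span. I assume region would still have to stay in as a feature, since the same converted amount may not mean the same thing in different markets.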
Currently, I have used 40% of the data points available in SGD to train a random forest model, and the results on the test set are reasonably accurate. For this model, I used only skills, role, and industry as features. Is there a better model I could explore?
Thank you