Regression dataset with categorical features
I have thought of a regression technique that I want to try on several datasets. I would like these datasets to have the following properties:
- Be a tabular dataset (no images).
- Have at least 20k rows, and ideally around 100k.
- Have some high-cardinality categorical variables (at least one variable with 100 or more levels).
- Ideally, the target should have long tails.
Does anyone know of any public datasets with these properties? I have found the Stack Overflow Developer Survey to work for me, but I'd like to have some more datasets with a similar structure.
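To make the criteria concrete, here is a rough sketch of how a candidate dataset could be screened with pandas. The function name `screen_dataset`, its default thresholds, and the choice to treat every non-numeric column as categorical are assumptions made for illustration; skewness and excess kurtosis are only one rough proxy for "long tails".

```python
import pandas as pd


def screen_dataset(df: pd.DataFrame, target: str,
                   min_rows: int = 20_000, min_levels: int = 100) -> dict:
    """Summarise how well a DataFrame matches the criteria listed above."""
    features = df.drop(columns=[target])

    # Assumption: every non-numeric column counts as categorical.
    # Integer-coded categoricals would have to be added by hand.
    cat_cols = features.select_dtypes(exclude="number").columns
    levels = features[cat_cols].nunique()

    y = df[target].dropna()
    return {
        "n_rows": len(df),
        "enough_rows": len(df) >= min_rows,
        # Categorical columns with at least `min_levels` distinct values.
        "high_cardinality_cols": levels[levels >= min_levels].to_dict(),
        # Large positive skew or excess kurtosis hints at a long-tailed target.
        "target_skew": float(y.skew()),
        "target_excess_kurtosis": float(y.kurt()),
    }
```

Calling something like `screen_dataset(df, target="salary")` (the column name is hypothetical) returns a small dictionary that can be compared across candidate datasets.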
Topic data regression dataset categorical-data open-source
Category Data Science