Is there any way to collect categorical features quickly in Julia DataFrames?
I'm using Julia 0.6.3 with DataFrames.jl.
Is there an easy way to identify the categorical features of a DataFrame automatically?
For large datasets, listing every categorical column by hand is impractical.
My workaround is to treat string columns as categorical, since they usually have low cardinality, but it's not foolproof.
My workaround so far:
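    # Collect the names of all columns whose type mentions "String"
    cat_cols = []
    for col in cols
        # contains() already returns a Bool, so no `== true` is needed
        if contains(string(typeof(X_train[col])), "String")
            push!(cat_cols, col)
        end
    end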
But it seems kind of ugly, and it misses label-encoded columns because they are integers.
I could also rely on low unique counts, but then sparse features would be picked up as well.
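To make the idea concrete, here is a rough sketch of the two heuristics combined (string-typed columns, plus integer columns with few distinct values). It is only an illustration: `guess_categorical`, the `max_levels` cutoff and the toy columns are made-up names, and it works on plain name => vector pairs instead of a DataFrame so that it does not depend on a particular DataFrames.jl version.

```julia
# Hypothetical helper: guess whether a column is categorical.
# A column qualifies if it holds strings, or if it holds integers
# with at most `max_levels` distinct values (an arbitrary cutoff).
function guess_categorical(v::AbstractVector; max_levels::Int = 10)
    T = eltype(v)
    if T <: AbstractString
        return true
    elseif T <: Integer
        return length(unique(v)) <= max_levels
    end
    return false
end

# Toy data standing in for the columns of X_train (made up for illustration).
columns = [
    :city  => ["Paris", "Lyon", "Paris", "Nice"],
    :label => [0, 1, 2, 1],           # label-encoded integers
    :price => [10.5, 3.2, 8.0, 7.7],  # continuous feature
]

cat_cols = Symbol[]
for (name, v) in columns
    if guess_categorical(v)
        push!(cat_cols, name)
    end
end

println(cat_cols)
```

This picks up the label-encoded integer column and skips the float column, but it still cannot tell a label-encoded column apart from a genuine integer feature that happens to take few values, which is exactly the part I'm unsure about.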
Any ideas? Thanks!