Keras: How to normalize dataframe with continuous and categorical data?
I have a dataframe with about 50 columns. Each column holds either categorical or continuous data. The continuous values can range from 0.000001 to 1.0 in some columns and from 500,000 to 5,000,000 in others. The categorical data is usually a name, for example a store name.
How can I normalize this data so that I can feed it into a dense layer of a Sequential model?
The Y values are either 0 or 1, so it is a binary classification problem. I am currently normalizing all of the continuous data to the range 0-1 and one-hot encoding all of the categorical data, so that if I have a column with 5 distinct names in it, I get a matrix with 5 columns filled with 0's and 1's. Then I join all of the continuous and categorical data and feed it into a Dense layer with init='uniform' and activation='relu'.
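To make the preprocessing concrete, here is a minimal sketch of roughly what I am doing, assuming pandas. The toy dataframe, the column names (`ratio`, `revenue`, `store`) and the helper `preprocess` are made up for illustration and are not my real data or code.

```python
import pandas as pd

# Hypothetical toy frame; column names and values are invented for illustration.
df = pd.DataFrame({
    "ratio": [0.000001, 0.5, 1.0],
    "revenue": [500_000.0, 2_750_000.0, 5_000_000.0],
    "store": ["A", "B", "A"],
})


def preprocess(frame, continuous_cols, categorical_cols):
    """Min-max scale the continuous columns and one-hot encode the categorical ones."""
    cont = frame[continuous_cols].astype("float64")
    col_min = cont.min()
    # Replace a zero range with 1 so constant columns do not divide by zero.
    col_range = (cont.max() - col_min).replace(0, 1.0)
    scaled = (cont - col_min) / col_range

    # One indicator column per distinct category value, e.g. store_A, store_B.
    one_hot = pd.get_dummies(
        frame[categorical_cols], columns=categorical_cols, dtype="float64"
    )

    # Join the scaled continuous block and the one-hot block side by side.
    return pd.concat([scaled, one_hot], axis=1)


X = preprocess(df, ["ratio", "revenue"], ["store"])
print(X)
```

`X.to_numpy()` is then the array that goes into the first Dense layer. In practice the minimum and range would presumably have to be computed on the training split only and reused for validation and test data.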
Is this the standard way of doing things?
Topic keras tensorflow theano deep-learning neural-network
Category Data Science