train_test_split ValueError: Input contains NaN
I tried to do stratified sampling with train_test_split
to save myself some trouble later, so I wrote the following lines:
from sklearn.model_selection import train_test_split
X = data_df
# pop() removes 'class' from data_df in place, so X no longer contains the label column
y = data_df.pop('class')
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.125, stratify=y)
I got the error:
ValueError: Input contains NaN
Any help is welcome!
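A diagnostic sketch that may help narrow this down. As far as I can tell, the array passed to stratify is validated for missing values, so the likely culprit is NaN in the 'class' column. The small data_df below is a hypothetical stand-in (the real columns are not shown in the post), and dropping the unlabeled rows is only one possible fix, assuming those rows are not needed:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for data_df; the real data is not shown in the post.
data_df = pd.DataFrame({
    "feature": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
    "class": ["a", "a", "a", "b", "b", np.nan, "a", "b", "a", "b"],
})

# Count missing values per column to see where the NaN comes from.
nan_counts = data_df.isna().sum()
print(nan_counts)

# One option (an assumption, not the only fix): drop rows with a missing label.
clean_df = data_df.dropna(subset=["class"])
y = clean_df["class"]
X = clean_df.drop(columns="class")

# With no NaN left in y, the stratified split can proceed.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
```

If the missing labels should be kept instead, filling them with a placeholder category (for example via fillna) before splitting is another route, at the cost of treating "missing" as its own class.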