Split npz Dataset into Train/Test using Sklearn

Question

Mohammed Abed

March 27, 2021, 17:44

I have a dataset of faces stored in an NPZ file that I would like to use to train a Siamese network. To do that, the dataset must be split into train and test sets using Sklearn. However, when I run the code to do the split, I get this error message:

ValueError: Found input variables with inconsistent numbers of samples: [2, 199139]

How can I solve this issue? My dataset consists of 199139 labeled faces, so each face has a label, and both need to be split to be ready for training. This is the code I currently use for the split:

from sklearn.model_selection import train_test_split

# Split into training and testing data
(train_images, train_labels), (test_images, test_labels) = train_test_split(data, labels2, test_size=0.20, random_state=42)

Thank you in advance!
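
The error says the first argument has 2 entries along its first axis while the labels have 199139, so `data` is probably not the array of faces itself (for example, it may be a pair or a container holding both arrays). Separately, `train_test_split` returns a flat list in the order X_train, X_test, y_train, y_test, not two nested pairs. A minimal sketch of a split that satisfies both points is below. It assumes the faces and labels are saved under the hypothetical keys `faces` and `labels`, and it uses a tiny synthetic archive in place of the real file, since the actual NPZ layout is not shown in the question.

```python
import io

import numpy as np
from sklearn.model_selection import train_test_split

# Build a tiny in-memory NPZ archive as a stand-in for the real file.
# The key names "faces" and "labels" are assumptions for illustration.
buffer = io.BytesIO()
np.savez(buffer, faces=np.random.rand(10, 4, 4), labels=np.arange(10) % 2)
buffer.seek(0)

# Load each array by key instead of passing the whole archive around.
archive = np.load(buffer)
faces = archive["faces"]
labels = archive["labels"]

# Both inputs must have the same number of samples on the first axis.
assert len(faces) == len(labels), (faces.shape, labels.shape)

# train_test_split returns X_train, X_test, y_train, y_test, in that order.
train_images, test_images, train_labels, test_labels = train_test_split(
    faces, labels, test_size=0.20, random_state=42
)

print(train_images.shape, test_images.shape)
```

With the real file, printing `archive.files` and the shape of each array first should show which key holds the 199139 faces and which holds the labels.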

Topic siamese-networks cnn image-recognition deep-learning

Category Data Science