How to resolve the "too many indices for array" IndexError

I'm performing binary classification in Keras and trying to plot the ROC curve. When I compute the fpr and tpr metrics, I get the "too many indices for array" error. Here is my code:

#declare the number of classes
num_classes=2
#predicted labels
y_pred = model.predict_generator(test_generator, nb_test_samples/batch_size, workers=1)
#true labels
Y_test=test_generator.classes
#print the predicted and true labels
print(y_pred)
print(Y_test)
'''y_pred float32 (624,2) array([[9.99e-01  2.59e-04],
                                 [9.97e-01  2.91e-03],...'''

'''Y_test int32 (624,) array([0,0,0,...,1,1,1],dtype=int32)'''

#reshape the predicted labels and convert type
y_pred = y_pred.argmax(axis=-1)
y_pred = y_pred.astype('int32')

#plot ROC curve
fpr = dict()
tpr = dict()
roc_auc = dict()
for i in range(num_classes):
    fpr[i], tpr[i], _ = roc_curve(Y_test[:,i], y_pred[:, i])
    roc_auc[i] = auc(fpr[i], tpr[i])
fig=plt.figure(figsize=(15,10), dpi=100)
ax = fig.add_subplot(1, 1, 1)
# Major ticks every 0.05, minor ticks every 0.05
major_ticks = np.arange(0.0, 1.0, 0.05)
minor_ticks = np.arange(0.0, 1.0, 0.05)
ax.set_xticks(major_ticks)
ax.set_xticks(minor_ticks, minor=True)
ax.set_yticks(major_ticks)
ax.set_yticks(minor_ticks, minor=True)
ax.grid(which='both')
lw = 1 
plt.plot(fpr[1], tpr[1], color='red',
         lw=lw, label='ROC curve (area = %0.4f)' % roc_auc[1])
plt.plot([0, 1], [0, 1], color='black', lw=lw, linestyle='--')
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver operating characteristics')
plt.legend(loc="lower right")
plt.show()

The shapes of y_pred and Y_test are:

y_pred float32 (624,2) array([[9.99e-01 2.59e-04], [9.97e-01 2.91e-03],...

Y_test int32 (624,) array([0,0,0,...,1,1,1],dtype=int32)



Your code is broken in two places.

The first problem is that you took the argmax of the class probabilities in y_pred. The line

y_pred = y_pred.argmax(axis=-1)

collapses your (624, 2) prediction array into a (624,) vector of predicted labels, the same shape as your vector of true classes. When you later slice it with y_pred[:, i], NumPy raises the IndexError because there is no longer a second dimension. This isn't the behavior you want anyway, since roc_curve needs the continuous class probabilities your model produces, not hard 0/1 labels, in order to sweep over decision thresholds.

The second problem has the same cause: Y_test has shape (624,), so Y_test[:, i] also tries to index the second dimension of a one-dimensional NumPy array.
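Both failures can be reproduced without Keras. The snippet below is only a small sketch: the three-row arrays are made-up stand-ins for the (624, 2) and (624,) arrays in the question, and second_axis_error is a hypothetical helper written just for this illustration.

```python
import numpy as np

# Made-up stand-ins for the arrays in the question (assumed values).
y_pred = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]])
Y_test = np.array([0, 1, 0], dtype="int32")

# argmax over the last axis collapses (3, 2) probabilities into (3,) labels.
labels = y_pred.argmax(axis=-1)
print(y_pred.shape, labels.shape, Y_test.shape)


def second_axis_error(arr):
    """Hypothetical helper: return the IndexError message raised by
    arr[:, 0], or None if the slice works."""
    try:
        arr[:, 0]
    except IndexError as exc:
        return str(exc)
    return None


print(second_axis_error(labels))  # 1-D after argmax, so slicing fails
print(second_axis_error(Y_test))  # 1-D from the start, so slicing fails
print(second_axis_error(y_pred))  # still 2-D, so slicing works
```

Only the untouched 2-D y_pred can be sliced with [:, i]; the other two arrays have a single axis.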

So if you want the TPR/FPR for both classes, treating each in turn as the positive class, drop these lines:

#reshape the predicted labels and convert type
y_pred = y_pred.argmax(axis=-1)
y_pred = y_pred.astype('int32')

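and change the first line of your for loop so that Y_test is passed whole. Also pass pos_label=i: with 0/1 labels roc_curve treats 1 as the positive class by default, so without it the curve for class 0 would score the class-0 probabilities against the class-1 labels and come out inverted:

fpr[i], tpr[i], _ = roc_curve(Y_test, y_pred[:, i], pos_label=i)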
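Put together, the corrected loop might look like the sketch below. It does not use your model: Y_test and y_pred are synthetic arrays generated here purely for illustration, with the same layout as yours (1-D integer labels, one probability column per class). The pos_label=i argument tells roc_curve which label counts as positive for column i, since with 0/1 labels it assumes 1 by default.

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

num_classes = 2
rng = np.random.default_rng(0)

# Synthetic data (assumption): 50 samples per class, with noisy
# class-1 probabilities that tend to be higher for true class 1.
Y_test = np.array([0] * 50 + [1] * 50, dtype="int32")
p1 = np.clip(0.3 + 0.4 * Y_test + rng.normal(0.0, 0.2, size=Y_test.size), 0.0, 1.0)
y_pred = np.column_stack([1.0 - p1, p1])  # shape (100, 2), rows sum to 1

fpr, tpr, roc_auc = {}, {}, {}
for i in range(num_classes):
    # Y_test stays 1-D; column i of y_pred is the score for class i,
    # and pos_label=i marks class i as the positive class.
    fpr[i], tpr[i], _ = roc_curve(Y_test, y_pred[:, i], pos_label=i)
    roc_auc[i] = auc(fpr[i], tpr[i])

print(roc_auc)
```

With two classes whose probabilities sum to 1, both curves carry the same information, so plotting fpr[1] against tpr[1] as in your plotting code is enough.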

hope this helps
