Fisher's Iris data set with Caffe
I am trying to use Caffe on the usual Fisher's Iris data set (150 flowers, each having 4 features, and split into 3 classes):
- if a flower belongs to class 1 (setosa), the network output should be [1, 0, 0]
- if a flower belongs to class 2 (versicolor), the network output should be [0, 1, 0]
- if a flower belongs to class 3 (virginica), the network output should be [0, 0, 1]
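To make the target encoding concrete, here is a minimal NumPy sketch of how the labels could be turned into those vectors. The helper name `one_hot` and the 0-based class indices (0 = setosa, 1 = versicolor, 2 = virginica) are my own assumptions for illustration, not part of Caffe:

```python
import numpy as np

def one_hot(labels, num_classes=3):
    """Map 0-based integer class indices to one-hot target rows."""
    labels = np.asarray(labels, dtype=int)
    targets = np.zeros((labels.size, num_classes), dtype=np.float32)
    # Set a single 1 per row, in the column given by the class index.
    targets[np.arange(labels.size), labels] = 1.0
    return targets

# Hypothetical indices: 0 = setosa, 1 = versicolor, 2 = virginica
targets = one_hot([0, 1, 2])
print(targets)
```

Each row of `targets` is the label vector for one flower, with one column per class.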
I use SigmoidCrossEntropyLoss, as it is used for predicting K independent probability values in [0,1]. Let ip3 be the layer that feeds SigmoidCrossEntropyLoss (together with the label layer). Looking at ip3's output, I interpret negative values as class absent (0) and positive values as class present (1), since a sigmoid maps 0 to a probability of 0.5. E.g. if ip3's output is [-0.3; 0.9; -0.4], then I interpret it as [0; 1; 0], i.e. class 2 is present but classes 1 and 3 are absent.
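The interpretation rule above can be sketched in a few lines of NumPy. This is only an illustration of how I read ip3's raw output; the function names `sigmoid` and `decode` are hypothetical helpers, and the input is the example vector from above rather than real network output:

```python
import numpy as np

def sigmoid(x):
    """Logistic function: maps a raw score to a value in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def decode(ip3_output):
    """Read each raw score independently: positive -> 1, otherwise -> 0."""
    scores = np.asarray(ip3_output, dtype=float)
    return (scores > 0).astype(int)

example = [-0.3, 0.9, -0.4]          # example ip3 output from the text
probs = sigmoid(np.array(example))   # per-class probabilities
pred = decode(example)               # thresholded prediction
print(probs, pred)
```

Thresholding the raw score at 0 gives the same result as thresholding the sigmoid probability at 0.5, which is why I look at the sign of ip3's output directly.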
The network does a good job classifying classes 1 and 3 (accuracy over 90%), but consistently fails to predict class 2: it always predicts that class 2 is absent. In other words, ip3's second output is always negative, so the network's second output is always 0, no matter what the input is.
Here is the architecture I use. Am I doing something wrong? Is the architecture not suited to the task?
Topic caffe multiclass-classification deep-learning
Category Data Science