Fisher's Iris data set with Caffe

I am trying to use Caffe on the usual Fisher's Iris data set (150 flowers, each having 4 features, and split into 3 classes):

  • if a flower belongs to class 1 (setosa), the network output should be [1, 0, 0]
  • if a flower belongs to class 2 (versicolor), the network output should be [0, 1, 0]
  • if a flower belongs to class 3 (virginica), the network output should be [0, 0, 1]
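
For illustration, the target encoding in the list above can be sketched in plain Python. The helper name `one_hot` and the 0-based index order (setosa, versicolor, virginica) are assumptions made for this example, not part of my actual setup.

```python
# Class names come from the list above; the 0-based index order is an assumption.
CLASSES = ["setosa", "versicolor", "virginica"]


def one_hot(class_index, num_classes=3):
    """Return a target vector with 1 at class_index and 0 everywhere else."""
    return [1 if i == class_index else 0 for i in range(num_classes)]


# Map each class name to the output vector the network is expected to produce.
targets = {name: one_hot(i) for i, name in enumerate(CLASSES)}

for name, vector in targets.items():
    print(name, vector)
```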

I use SigmoidCrossEntropyLoss because it is meant for predicting K independent probability values in [0, 1]. Let ip3 be the layer connected to SigmoidCrossEntropyLoss (alongside the label layer). Looking at ip3's output, I interpret negative values as class absent (0) and positive values as class present (1). For example, if ip3's output is [-0.3; 0.9; -0.4], I interpret it as [0; 1; 0], i.e. class 2 is present but classes 1 and 3 are absent.
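
The sign-based reading of ip3 described above can be sketched as follows. Thresholding the raw score at 0 is the same as thresholding the sigmoid probability at 0.5, since sigmoid(0) = 0.5. The function names are hypothetical, and the loss is the standard per-sample sigmoid cross-entropy summed over the K outputs, written here in plain Python rather than taken from Caffe; batch normalization of the loss is left out.

```python
import math


def sigmoid(x):
    """Logistic function mapping a raw score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode(scores):
    """Read a positive score as 'class present' (1), anything else as 'absent' (0)."""
    return [1 if s > 0 else 0 for s in scores]


def sigmoid_cross_entropy(scores, targets):
    """Sum over outputs of -[t*log(p) + (1-t)*log(1-p)] with p = sigmoid(score).

    Uses the numerically stable form max(x, 0) - x*t + log(1 + exp(-|x|)).
    """
    total = 0.0
    for x, t in zip(scores, targets):
        total += max(x, 0.0) - x * t + math.log1p(math.exp(-abs(x)))
    return total


ip3_output = [-0.3, 0.9, -0.4]  # example values from the paragraph above
print(decode(ip3_output))
print([round(sigmoid(s), 3) for s in ip3_output])
print(sigmoid_cross_entropy(ip3_output, [0, 1, 0]))
```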

The network does a good job on classes 1 and 3 (accuracy over 90%) but consistently fails to predict class 2: ip3's second output is always negative, so the network's second output is always 0 (class 2 absent), no matter what the input is.
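
To make the symptom concrete, here is a rough sketch of how accuracy could be measured per output. The predictions below are hypothetical stand-ins (outputs 1 and 3 perfect, output 2 stuck at 0), not my real results, and `per_output_accuracy` is a name made up for this example. With the usual 50 flowers per class, an output that always says "absent" is still right on 100 of the 150 flowers.

```python
def per_output_accuracy(predictions, targets):
    """Fraction of samples on which each output matches its target."""
    num_outputs = len(targets[0])
    correct = [0] * num_outputs
    for pred, tgt in zip(predictions, targets):
        for k in range(num_outputs):
            if pred[k] == tgt[k]:
                correct[k] += 1
    return [c / len(targets) for c in correct]


# Iris targets: 50 flowers in each of the 3 classes.
targets = [[1, 0, 0]] * 50 + [[0, 1, 0]] * 50 + [[0, 0, 1]] * 50

# Hypothetical predictions: outputs 1 and 3 are correct, output 2 is always 0.
predictions = [[t[0], 0, t[2]] for t in targets]

accuracy = per_output_accuracy(predictions, targets)
print(accuracy)
```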

Here is the architecture I use. Am I doing something wrong? Is the architecture not suited for the task?
