Low accuracy on the test set
I have a dataset with 16 features and 32 class labels, and I observe the following behavior:
Neural network classification: training accuracy is 100%, but test accuracy is only about 3%, which is essentially random guessing (with 32 classes, chance level is 1/32 ≈ 3%). If I make the network less flexible (fewer neurons or hidden layers), train and test accuracy both end up around 10%. A sketch of the setup follows below.
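To make the setup concrete, here is a simplified sketch of the neural network experiment in scikit-learn terms (the framework and hyperparameters are stand-ins, not my exact code, and `X`/`y` below are synthetic data shaped like my real dataset):

```python
# Simplified sketch of the neural-network experiment, assuming scikit-learn.
# X and y are synthetic stand-ins shaped like my real data (16 features,
# 32 classes), not the actual dataset.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(
    n_samples=2000, n_features=16, n_informative=8,
    n_classes=32, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# "Flexible" network: two wide hidden layers that can memorize the train set.
net = MLPClassifier(hidden_layer_sizes=(256, 256), max_iter=1000, random_state=0)
net.fit(X_train, y_train)
print("train accuracy:", net.score(X_train, y_train))
print("test accuracy: ", net.score(X_test, y_test))
```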
Gradient boosting tree classification: exactly the same behavior (sketched below). A flexible model reaches 100% training accuracy but near-random test accuracy; if I reduce the flexibility, train and test accuracy are both very low, around 10%.
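The gradient boosting run follows the same evaluation pattern, again sketched with scikit-learn and reusing the split from the snippet above; tree depth is the flexibility knob here:

```python
# Same evaluation for gradient-boosted trees, reusing X_train/X_test from
# the sketch above; max_depth controls how flexible the model is.
from sklearn.ensemble import HistGradientBoostingClassifier

for max_depth in (2, None):  # shallow trees vs. unrestricted depth
    gbt = HistGradientBoostingClassifier(
        max_depth=max_depth, max_iter=200, random_state=0
    )
    gbt.fit(X_train, y_train)
    print(f"max_depth={max_depth}: "
          f"train={gbt.score(X_train, y_train):.2f}, "
          f"test={gbt.score(X_test, y_test):.2f}")
```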
What could be the reason? How can I fix it? Is there any other algorithm I could try?
Here is the distribution of the target labels: