Self-attention model trained with active learning stops learning after a few iterations

I'm doing active learning with uncertainty sampling on a self-attention model implemented in PyTorch. The algorithm works as follows (steps 3-7 are repeated for 14 iterations; a sketch of the loop follows the list):

1. Take 10% of the data as the initial training set L
2. Train the model on L
3. Either rank the remaining pool U by an informativeness measure and pick a batch B of the top n samples, or pick B at random (as a baseline)
4. Add B to L
5. Remove B from U
6. Re-train the model on the updated L (which now includes B), after resetting the weights of the model, so that training is cumulative
7. Evaluate the model on the test set (accuracy)
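
For concreteness, here is a minimal sketch of the loop I mean, using predictive entropy as the informativeness measure. `build_model`, `train`, `evaluate`, `make_loader`, `n_query`, `X`, `y`, `test_loader` and the index tensors `labeled`/`pool` stand in for my actual setup:

    import torch
    import torch.nn.functional as F

    device = "cuda" if torch.cuda.is_available() else "cpu"

    def predictive_entropy(model, loader):
        # Uncertainty score: entropy of the predictive distribution
        # (higher entropy = more uncertain = more informative).
        model.eval()
        scores = []
        with torch.no_grad():
            for xb, _ in loader:
                probs = F.softmax(model(xb.to(device)), dim=-1)
                ent = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
                scores.append(ent.cpu())
        return torch.cat(scores)

    # `labeled` and `pool` are LongTensors of indices into X, y.
    for it in range(14):
        model = build_model().to(device)                   # fresh weights every iteration
        train(model, make_loader(X[labeled], y[labeled]))  # steps 2/6: train on L

        # step 3: rank U by uncertainty and take the top-n batch B
        ent = predictive_entropy(model, make_loader(X[pool], y[pool]))
        top = ent.topk(n_query).indices

        labeled = torch.cat([labeled, pool[top]])          # step 4: add B to L
        keep = torch.ones(len(pool), dtype=torch.bool)
        keep[top] = False
        pool = pool[keep]                                  # step 5: remove B from U

        test_acc = evaluate(model, test_loader)            # step 7: test accuracy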

Now, the model performs well in the first 2-3 iterations and clearly learns, judging by the validation set. In later iterations, however, it stops learning entirely: training and validation accuracy and loss are stuck at the same values, and the test accuracy always drops to that of a random classifier (around 33%), as you can see here:

The test accuracy should instead increase, since the model is trained on a larger dataset at each iteration. I really cannot understand why this happens: the code used in each iteration is exactly the same, and I always reset the weights before re-training. This is the code I use to reset the weights:

def reset_weights(m):
    # `Module.apply` already visits every submodule recursively,
    # so it is enough to reset the module passed in; looping over
    # `m.children()` on top of that is redundant.
    if hasattr(m, 'reset_parameters'):
        m.reset_parameters()

And then:

model.apply(reset_weights)
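
As a sanity check (just a diagnostic, not part of the training pipeline), one can verify that the reset actually changes every floating-point parameter:

    import copy
    import torch

    before = copy.deepcopy(model.state_dict())
    model.apply(reset_weights)

    # after a real re-initialization, every learnable tensor should differ
    for name, tensor in model.state_dict().items():
        if tensor.dtype.is_floating_point:
            same = torch.equal(before[name], tensor)
            print(f"{name}: {'UNCHANGED' if same else 'reset'}")

If anything prints UNCHANGED, that module is not being re-initialized (for example, a custom submodule without a `reset_parameters` method).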

Could it have something to do with how PyTorch handles caching? The same procedure works perfectly fine with models implemented in Keras.

