How to use regularizer in AllenNLP?

Apologies if this sounds a bit lame.

I am trying to use AllenNLP for my NLP tasks and would like to use regularization to reduce overfitting. However, in all the online tutorials the regularizer is set to None, and after many attempts I still couldn't figure out how to actually use it.

If I use the example from the official tutorial (https://github.com/titipata/allennlp-tutorial), how would I add a regularizer to the LSTM and feedforward layers?

from typing import Dict, Optional

import numpy as np
import torch
import torch.nn.functional as F

from allennlp.data import Vocabulary
from allennlp.models import Model
from allennlp.modules import FeedForward, Seq2VecEncoder, TextFieldEmbedder
from allennlp.nn import InitializerApplicator, RegularizerApplicator
from allennlp.nn.util import get_text_field_mask
from allennlp.training.metrics import CategoricalAccuracy


class AcademicPaperClassifier(Model):
    """
    Model to classify venue based on input title and abstract
    """
    def __init__(self, 
                 vocab: Vocabulary,
                 text_field_embedder: TextFieldEmbedder,
                 title_encoder: Seq2VecEncoder,
                 abstract_encoder: Seq2VecEncoder,
                 classifier_feedforward: FeedForward,
                 initializer: InitializerApplicator = InitializerApplicator(),
                 regularizer: Optional[RegularizerApplicator] = None) -> None:
        super(AcademicPaperClassifier, self).__init__(vocab, regularizer)
        self.text_field_embedder = text_field_embedder
        self.num_classes = self.vocab.get_vocab_size("labels")
        self.title_encoder = title_encoder
        self.abstract_encoder = abstract_encoder
        self.classifier_feedforward = classifier_feedforward
        self.metrics = {
                "accuracy": CategoricalAccuracy(),
                "accuracy3": CategoricalAccuracy(top_k=3)
        }
        self.loss = torch.nn.CrossEntropyLoss()
        initializer(self)

    def forward(self, 
                title: Dict[str, torch.LongTensor],
                abstract: Dict[str, torch.LongTensor],
                label: torch.LongTensor = None) -> Dict[str, torch.Tensor]:

        embedded_title = self.text_field_embedder(title)
        title_mask = get_text_field_mask(title)
        encoded_title = self.title_encoder(embedded_title, title_mask)

        embedded_abstract = self.text_field_embedder(abstract)
        abstract_mask = get_text_field_mask(abstract)
        encoded_abstract = self.abstract_encoder(embedded_abstract, abstract_mask)

        logits = self.classifier_feedforward(torch.cat([encoded_title, encoded_abstract], dim=-1))
        class_probabilities = F.softmax(logits, dim=-1)
        argmax_indices = np.argmax(class_probabilities.cpu().data.numpy(), axis=-1)
        labels = [self.vocab.get_token_from_index(x, namespace="labels") for x in argmax_indices]
        output_dict = {
            'logits': logits, 
            'class_probabilities': class_probabilities,
            'predicted_label': labels
        }
        if label is not None:
            loss = self.loss(logits, label)
            for metric in self.metrics.values():
                metric(logits, label)
            output_dict["loss"] = loss

        return output_dict

Topic: allennlp, deep-learning, nlp

Category: Data Science


The API is a bit confusing. You pass a RegularizerApplicator instance to the model, which takes a list of tuples of the form (regex, regularizer). Each regex is matched against your model's parameter names. For example, if you had parameters named linear_relu_stack.0.bias and linear_relu_stack.0.weight, you could apply a single regularizer to both with the regex "^linear_relu_stack.0.(bias|weight)$".
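If you are not sure what those parameter names are, you can just print them from an instantiated model (this is plain PyTorch; model below is assumed to be an AcademicPaperClassifier instance):

# List the parameter names the regexes are matched against. For the model
# above you will see names like "title_encoder._module.weight_ih_l0" or
# "classifier_feedforward._linear_layers.0.weight" (the exact names depend
# on which encoders you configured).
for name, param in model.named_parameters():
    print(name, tuple(param.shape))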

The regularizer itself is just an instance of L1Regularizer or L2Regularizer, where you can specify the alpha (the weight of the penalty).
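To give a sense of what that penalty is, here is a rough sketch of what the two regularizers compute for a single matched parameter tensor (just the idea, not the library source):

import torch

def l2_penalty(parameter: torch.Tensor, alpha: float = 0.01) -> torch.Tensor:
    # L2Regularizer in spirit: alpha times the sum of squared entries.
    return alpha * torch.sum(torch.pow(parameter, 2))

def l1_penalty(parameter: torch.Tensor, alpha: float = 0.01) -> torch.Tensor:
    # L1Regularizer in spirit: alpha times the sum of absolute values.
    return alpha * torch.sum(torch.abs(parameter))

The RegularizerApplicator sums these penalties over every parameter whose name matches one of the regexes, and the trainer adds that total to the loss returned by forward.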

In all, your model would look something like this after adding regularization:

from allennlp.nn.regularizers.regularizer_applicator import RegularizerApplicator
from allennlp.nn.regularizers.regularizers import L2Regularizer

AcademicPaperClassifier(*model_args, regularizer=RegularizerApplicator([
    # These regexes match the submodule names from __init__ above:
    # title_encoder and abstract_encoder are the LSTM encoders,
    # classifier_feedforward is the feedforward layer.
    ("title_encoder.*", L2Regularizer(alpha=0.01)),
    ("abstract_encoder.*", L2Regularizer(alpha=0.01)),
    ("classifier_feedforward.*", L2Regularizer(alpha=0.01)),
]))
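A quick way to check that your regexes actually matched something (model_args is just a placeholder for the usual constructor arguments): the base Model class exposes get_regularization_penalty(), and the trainer adds its value to the training loss.

model = AcademicPaperClassifier(*model_args, regularizer=RegularizerApplicator([
    ("title_encoder.*", L2Regularizer(alpha=0.01)),
]))
# If this prints 0.0, none of the regexes matched any parameter name.
print(float(model.get_regularization_penalty()))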

The API documentation is the best resource for this: http://docs.allennlp.org/v0.9.0/api/allennlp.nn.regularizers.html#allennlp.nn.regularizers.regularizer_applicator.RegularizerApplicator.from_params
