mBART training "CUDA out of memory"

I want to train a network with the mBART model in Google Colab, but I get this error:

RuntimeError: CUDA out of memory. Tried to allocate 886.00 MiB (GPU 0; 15.90 GiB total capacity; 13.32 GiB already allocated; 809.75 MiB free; 14.30 GiB reserved in total by PyTorch)

I have a Colab subscription with GPU access. I already tried setting the maximum total input sequence length to 128 and to 64.

What can I do to fix this problem?

Topic: cuda, transformer, colab, gpu

Category: Data Science


The mBART checkpoint available in Hugging Face Transformers is only released in its large configuration; there is no base-sized variant. This means you will not be able to fit full training of this transformer on the single ~16 GB GPU offered by Colab without memory-saving techniques.
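
As a quick sanity check on the model's size, you can count its parameters. A minimal sketch, assuming the commonly used facebook/mbart-large-cc25 checkpoint (the question does not name the exact checkpoint being trained):

```python
from transformers import MBartForConditionalGeneration

# Assumption: the standard large checkpoint; swap in whichever one you train.
model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-cc25")

# Count all parameters of the model.
n_params = sum(p.numel() for p in model.parameters())
print(f"Parameters: {n_params / 1e6:.0f}M")  # roughly 610M for mbart-large-cc25
```

At roughly 610M parameters, the fp32 weights, gradients, and Adam optimizer states alone take on the order of 10 GB before any activations are allocated, which is why shrinking the sequence length by itself often isn't enough.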

To deal with the GPU OOM problem, you can use gradient checkpointing, which recomputes activations during the backward pass instead of storing them all; see the PyTorch documentation at https://pytorch.org/docs/stable/checkpoint.html.
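
In practice you do not need to call torch.utils.checkpoint yourself for mBART, since Transformers wraps it for you. A minimal sketch, assuming a reasonably recent transformers version (older releases set gradient_checkpointing=True on the model config instead of using the convenience method):

```python
from transformers import MBartForConditionalGeneration

model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-cc25")

# Gradient checkpointing drops intermediate activations in the forward pass
# and recomputes them during backward (via torch.utils.checkpoint under the
# hood), trading extra compute (slower steps) for much lower memory use.
model.gradient_checkpointing_enable()
```

Combined with the shorter sequence lengths you already tried and a small per-device batch size, this is usually enough to fit training of the large checkpoint on Colab's 16 GB GPU.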

Refer to this answer for detailed information.
