How is the GPU still used while a CUDA out-of-memory error occurs?

I am using TensorFlow to perform inference on a dataset on Ubuntu. Although it reports a CUDA out-of-memory error, the nvidia-smi tool still shows that the GPU is being used, as shown below:

My code predicts one example at a time, so no batching is used. I am using GPU 0, so the first 47% utilization in the nvidia-smi output is from my process (a rough sketch of my setup follows the error message). The error message is below:

INFO:tensorflow:Restoring parameters from /plu/../../model-files/model.ckpt-2683000
2021-09-09 07:49:24.230623: I tensorflow/stream_executor/cuda/cuda_driver.cc:831] failed to allocate 15.75G (16914055168 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2021-09-09 07:49:31.674556: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
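For reference, my inference setup is roughly the sketch below. The checkpoint path, tensor names, and the predict_one helper are placeholders rather than my exact code; the point is only that a single example is fed per sess.run call and that only GPU 0 is made visible before TensorFlow initializes CUDA.

import os

# Make only GPU 0 visible; this must happen before TensorFlow initializes CUDA.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import tensorflow as tf

# Placeholder checkpoint path, based on the prefix shown in the log above.
CKPT = "/path/to/model-files/model.ckpt-2683000"

graph = tf.Graph()
with graph.as_default():
    # TF1-style restore, matching the "Restoring parameters from ..." log line.
    saver = tf.compat.v1.train.import_meta_graph(CKPT + ".meta")
    inputs = graph.get_tensor_by_name("input:0")    # placeholder tensor name
    outputs = graph.get_tensor_by_name("output:0")  # placeholder tensor name

sess = tf.compat.v1.Session(graph=graph)
saver.restore(sess, CKPT)

def predict_one(example):
    # One example per call: the batch dimension is always 1, no batching.
    return sess.run(outputs, feed_dict={inputs: [example]})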

My machine has a lot of memory, as shown below:

free -hm
              total        used        free      shared  buff/cache   available
Mem:           125G         16G        8.3G        1.1G        100G        107G
Swap:            0B          0B          0B

I have two questions:

1. Why is the GPU still being used normally while a CUDA out-of-memory error occurs?

2. My machine seems to have plenty of memory. Does this mean the 107G of available host memory is not used, and only the 16G of CUDA (GPU) memory is used, which is what caused the out-of-memory error?
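To clarify what I suspect: the 15.75G in the error seems to refer to the GPU's own memory, not to the 125G of host RAM, and TensorFlow by default tries to reserve almost the whole card up front. The sketch below is what I would try in order to stop that up-front reservation; these are TF 2.x config calls and may need adjusting for my TensorFlow version, so treat them as an assumption rather than what my script currently does.

import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")
if gpus:
    # Allocate GPU memory on demand instead of reserving ~16G at start-up.
    # Must be called before the GPU is first used.
    tf.config.experimental.set_memory_growth(gpus[0], True)

    # Alternative: hard-cap this process to a fixed slice of GPU 0 (e.g. 4 GB)
    # so another process on the same card still has room.
    # tf.config.set_logical_device_configuration(
    #     gpus[0],
    #     [tf.config.LogicalDeviceConfiguration(memory_limit=4096)])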

