Strange CUDA out of memory behavior in PyTorch

Edit: SOLVED. The problem came down to the number of workers; lowering them solved it.
I am using a 24 GB Titan RTX for an image segmentation U-Net in PyTorch.
It keeps throwing CUDA out of memory at different batch sizes. On top of that, I have more free memory than it says it needs, and lowering the batch size increases the memory it tries to allocate, which doesn't make any sense.
Here is what I tried:
Image size = 448, batch size = 8
"RuntimeError: CUDA error: out of memory"
Image size = 448, batch size = 6
"RuntimeError: CUDA out of memory. Tried to allocate 3.12 GiB (GPU 0; 24.00 GiB total capacity; 2.06 GiB already allocated; 19.66 GiB free; 2.31 GiB reserved in total by PyTorch)"
It says it tried to allocate 3.12 GiB while I have 19 GiB free, and it still throws an error??
Image size = 224, batch size = 8
"RuntimeError: CUDA out of memory. Tried to allocate 28.00 MiB (GPU 0; 24.00 GiB total capacity; 2.78 GiB already allocated; 19.15 GiB free; 2.82 GiB reserved in total by PyTorch)"
Image size = 224, batch size = 6
"RuntimeError: CUDA out of memory. Tried to allocate 344.00 MiB (GPU 0; 24.00 GiB total capacity; 2.30 GiB already allocated; 19.38 GiB free; 2.59 GiB reserved in total by PyTorch)"
Reduced the batch size, but it tried to allocate more???
Image size = 224, batch size = 4
"RuntimeError: CUDA out of memory. Tried to allocate 482.00 MiB (GPU 0; 24.00 GiB total capacity; 2.21 GiB already allocated; 19.48 GiB free; 2.50 GiB reserved in total by PyTorch)"
Image size = 224, batch size = 2
"RuntimeError: CUDA out of memory. Tried to allocate 1.12 GiB (GPU 0; 24.00 GiB total capacity; 1.44 GiB already allocated; 19.88 GiB free; 2.10 GiB reserved in total by PyTorch)"
Image size = 224, batch size = 1
"RuntimeError: CUDA out of memory. Tried to allocate 1.91 GiB (GPU 0; 24.00 GiB total capacity; 894.36 MiB already allocated; 20.94 GiB free; 1.03 GiB reserved in total by PyTorch)"
Even with stupidly low image sizes and batch sizes...

SOLVED: the problem came down to the number of workers; lowering them solved it.
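For reference, the worker count here is presumably the num_workers argument of the training DataLoader. A minimal sketch of where that setting lives, with a placeholder dataset standing in for the segmentation images and masks (none of these values are from the original setup):

import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder data standing in for the image/mask pairs.
dataset = TensorDataset(torch.randn(16, 3, 448, 448),
                        torch.randint(0, 2, (16, 1, 448, 448)))

loader = DataLoader(
    dataset,
    batch_size=6,
    shuffle=True,
    num_workers=2,   # the "number of workers" the fix refers to, lowered here
    pin_memory=True,
)

for images, masks in loader:
    images, masks = images.cuda(), masks.cuda()
    # ... forward/backward pass of the U-Net would go here ...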

Related

speechbrain & CUDA out of memory

I am trying to enhance an audio file (3:16 minutes in length, available here) using SpeechBrain. If I run the code below (from this tutorial), I get the error:
OutOfMemoryError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 0; 39.59 GiB total capacity; 33.60 GiB already allocated; 3.19 MiB free; 38.06 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF.
What is the recommended way to fix the issue? Should I just cut the audio file in pieces?
from speechbrain.pretrained import SepformerSeparation as separator
import torchaudio

audio_file = "example.wav"  # placeholder path for the recording being enhanced
model = separator.from_hparams(source="speechbrain/sepformer-wham-enhancement",
                               savedir="pretrained_models/sepformer-wham-enhancement",
                               run_opts={"device": "cuda"})
est_sources = model.separate_file(path=audio_file)
torchaudio.save("enhanced_wham.wav", est_sources[:, :, 0].detach().cpu(), 8000)

How to force my program to use all the GPUs with less batch size?

Recently I asked a question and answered myself after finding its solution. Please read the question if possible.
To be concise: I have 8 GPUs, but I can only run on 7 of them with a batch size of 7.
To use all 8 GPUs I need to increase the batch size to 8, but then the program throws a CUDA out of memory error.
RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0;
10.76 GiB total capacity; 9.71 GiB already allocated; 7.56 MiB free; 9.78 GiB reserved in total by PyTorch)
So, I want to run my program on all the GPUs with a batch size of 7. Is that possible?
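Assuming the batch is being split with nn.DataParallel (which keeps gathered outputs and extra state on GPU 0, so GPU 0 runs out first), one common workaround is DistributedDataParallel, where each process runs its own per-GPU batch and no single GPU carries the extra memory. A minimal sketch of that setup; the model and data below are placeholders, not from the question:

import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # torchrun sets RANK, LOCAL_RANK and WORLD_SIZE for every process.
    dist.init_process_group("nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(512, 10).cuda(local_rank)   # placeholder model
    model = DDP(model, device_ids=[local_rank])

    # Each process handles its own per-GPU batch (one sample here), so the
    # effective global batch is world_size samples spread evenly over the GPUs.
    x = torch.randn(1, 512, device=local_rank)
    model(x).sum().backward()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()   # launch with: torchrun --nproc_per_node=8 this_script.py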

Why does PyTorch fail to allocate exactly the amount of free GPU memory?

I am seeing this error pretty randomly. It happens after creating a checkpoint. PyTorch throws an OOM error while trying to allocate what seems to be a reasonable amount of memory:
RuntimeError: CUDA out of memory. Tried to allocate 8.26 GiB (GPU 2; 14.76 GiB total capacity; 3.50 GiB already allocated; 8.26 GiB free; 5.41 GiB reserved in total by PyTorch)
I wonder what could be going on here.
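One thing worth ruling out here is fragmentation of the caching allocator: the 8.26 GiB reported as free may not be available as a single contiguous block. A short sketch of two things PyTorch exposes for this; the device index 2 matches the error message, and the max_split_size_mb value is a placeholder:

import os

# Must be set before CUDA is initialized in the process.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch

# Around the failing allocation, compare what live tensors hold with what
# the caching allocator has reserved; a large gap plus many small blocks in
# the summary points at fragmentation.
print(torch.cuda.memory_allocated(2) / 2**30, "GiB allocated")
print(torch.cuda.memory_reserved(2) / 2**30, "GiB reserved")
print(torch.cuda.memory_summary(device=2, abbreviated=True))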

Pytorch not accessing the GPU's memory

I'm trying to run a reinforcement learning algorithm using PyTorch, but it keeps telling me that CUDA is out of memory. However, it seems that PyTorch is only accessing a tiny amount of my GPU's memory.
RuntimeError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 0; 4.00 GiB total capacity; 3.78 MiB already allocated; 0 bytes free; 4.00 MiB reserved in total by PyTorch)
It's not that PyTorch can only access a tiny amount of GPU memory; your program has been accumulating tensors on the GPU, and that 2 MiB allocation is simply the one that hits the limit. Try a lower batch size, or run the model in half precision to save GPU memory.
If the process should run on a different GPU, selecting it before CUDA is initialized should work:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "1"  # expose only GPU 1 to PyTorch
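The half-precision suggestion above can be done with PyTorch's automatic mixed precision. A minimal sketch, with a placeholder network and batch standing in for the actual RL model:

import torch

model = torch.nn.Linear(128, 4).cuda()            # placeholder policy network
optimizer = torch.optim.Adam(model.parameters())
scaler = torch.cuda.amp.GradScaler()              # rescales the loss for fp16 stability

obs = torch.randn(32, 128, device="cuda")         # placeholder batch of observations

optimizer.zero_grad()
with torch.cuda.amp.autocast():                   # forward pass runs in mixed precision
    loss = model(obs).pow(2).mean()               # placeholder loss
scaler.scale(loss).backward()
scaler.step(optimizer)
scaler.update()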

cuda out of memory while trying to train a model with Transformer model of spacy

I'm trying to train a custom NER model on top of the spaCy transformer model.
For faster training, I'm running it on a GPU with CUDA on Google Colab Pro with a high-RAM runtime.
After the first iteration, I get an error:
RuntimeError: CUDA out of memory. Tried to allocate 480.00 MiB (GPU 0;
11.17 GiB total capacity; 9.32 GiB already allocated; 193.31 MiB free; 10.32 GiB reserved in total by PyTorch)
For the above error I tried emptying the cache too, but I still get the error; it seems it's not freeing up enough space.
import torch
torch.cuda.empty_cache()
Moreover, I also tried reducing the batch size, even down to 2, but the same error still occurs.
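As a side note on the empty_cache() attempt: it only returns cached blocks that no live tensor is using, so memory held by tensors that are still referenced stays allocated. A small standalone sketch (independent of the spaCy pipeline) that makes the difference visible:

import torch

x = torch.empty(1024, 1024, 256, device="cuda")    # ~1 GiB tensor, still referenced
torch.cuda.empty_cache()
print(torch.cuda.memory_allocated() // 2**20, "MiB allocated")   # still ~1024

del x                                               # drop the last reference first
torch.cuda.empty_cache()                            # now the block can be released
print(torch.cuda.memory_reserved() // 2**20, "MiB reserved")     # ~0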

Resources