speechbrain & CUDA out of memory - audio

I am trying to enhance an audio file (3:16 minutes in length, available here) using SpeechBrain. If I run the code below (from this tutorial), I get the following error:
OutOfMemoryError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 0; 39.59 GiB total capacity; 33.60 GiB already allocated; 3.19 MiB free; 38.06 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF.
What is the recommended way to fix the issue? Should I just cut the audio file into pieces?
from speechbrain.pretrained import SepformerSeparation as separator
import torchaudio

model = separator.from_hparams(
    source="speechbrain/sepformer-wham-enhancement",
    savedir="pretrained_models/sepformer-wham-enhancement",
    run_opts={"device": "cuda"},
)
audio_file = "my_recording.wav"  # placeholder: path to the 3:16 clip mentioned above
est_sources = model.separate_file(path=audio_file)
torchaudio.save("enhanced_wham.wav", est_sources[:, :, 0].detach().cpu(), 8000)

Related

How to force my program to use all the GPUs with less batch size?

I recently asked a question and answered it myself after finding the solution. Please read that question if possible.
To be concise: I have 8 GPUs, but I can only run on 7 of them with batch size 7.
I would need to increase the batch size to 8 to use all 8 GPUs, but the program throws a CUDA out of memory error at batch size 8.
RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0;
10.76 GiB total capacity; 9.71 GiB already allocated; 7.56 MiB free; 9.78 GiB reserved in total by PyTorch)
So, I want to run my program on all 8 GPUs with batch size 7. Is that possible?
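
The self-answer referenced above isn't shown here, so the following is only a hedged sketch of one common way to get this behaviour: if the original script used nn.DataParallel, which splits a single batch across the GPUs (so batch size 7 leaves one GPU idle), switching to DistributedDataParallel gives every GPU its own batch of 7, keeping all 8 GPUs busy with a global batch of 56. The toy model, dataset and hyperparameters below are placeholders.

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset

def main():
    # Launch with: torchrun --nproc_per_node=8 train.py
    dist.init_process_group("nccl")
    rank = dist.get_rank()                        # equals the local rank on a single node
    torch.cuda.set_device(rank)

    # Toy model and dataset stand in for the real ones
    model = DDP(torch.nn.Linear(128, 10).cuda(rank), device_ids=[rank])
    dataset = TensorDataset(torch.randn(560, 128), torch.randint(0, 10, (560,)))

    sampler = DistributedSampler(dataset)                        # shards data across the 8 processes
    loader = DataLoader(dataset, batch_size=7, sampler=sampler)  # 7 per GPU, 56 globally

    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    for x, y in loader:
        x, y = x.cuda(rank), y.cuda(rank)
        optimizer.zero_grad()
        torch.nn.functional.cross_entropy(model(x), y).backward()
        optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()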

Why does PyTorch fail to allocate exactly the amount of free GPU memory?

I am seeing this error pretty randomly. It happens after creating a checkpoint. PyTorch throws an OOM error while trying to allocate what seems to be a reasonable amount of memory:
RuntimeError: CUDA out of memory. Tried to allocate 8.26 GiB (GPU 2; 14.76 GiB total capacity; 3.50 GiB already allocated; 8.26 GiB free; 5.41 GiB reserved in total by PyTorch)
I wonder what could be going on here.
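
No accepted explanation is shown here, but one common cause of this "asked for exactly what is free and still failed" pattern is that the request must be satisfied as one contiguous block (plus allocation granularity), so fragmented free memory, or another process grabbing memory right after the measurement, can make it fail. Below is a small sketch of how one might inspect the caching allocator and reduce fragmentation; the max_split_size_mb value of 128 is only an example.

import os

# Assumption: the allocator reads this once, so set it before the first CUDA call.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch

# ... the training / checkpointing code would run here ...

# Reserved vs. allocated memory and the block statistics hint at fragmentation.
print(torch.cuda.memory_summary(abbreviated=True))

# Hand cached-but-unused blocks back to the driver after the checkpoint,
# at the cost of slower subsequent allocations.
torch.cuda.empty_cache()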

Detectron2 Segmentation training : out of memory while training the Detectron2 mask-rcnn model on GPU

I tried almost all the options to train the model, including reducing the batch size to 1 and some other steps as described here:
How do I select which GPU to run a job on?,
But I still get the error:
RuntimeError: CUDA out of memory. Tried to allocate 238.00 MiB (GPU 3; 15.90 GiB total capacity; 15.20 GiB already allocated; 1.88 MiB free; 9.25 MiB cached)
This is the notebook, configured in an Azure ML workspace with N24-GPU.
Thank you.
Check your memory usage before you start training; sometimes Detectron2 doesn't free VRAM after use, particularly if training crashes. If this is the case, the easiest way to fix the issue in the short term is a reboot.
As for a long-term fix, I can't give any advice other than making sure you're using the latest version of everything.
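
A quick way to do the check suggested above, without leaving Python, is to print each GPU's free memory before training starts. This is just a convenience sketch; running nvidia-smi on the command line gives the same information.

import torch

# Pre-flight check of free GPU memory before launching training.
# torch.cuda.mem_get_info is available in recent PyTorch releases.
for i in range(torch.cuda.device_count()):
    free, total = torch.cuda.mem_get_info(i)   # both values are in bytes
    print(f"GPU {i}: {free / 1024**3:.2f} GiB free of {total / 1024**3:.2f} GiB")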

Pytorch not accessing the GPU's memory

I'm trying to run a reinforcement learning algorithm using pytorch, but it keeps telling me that CUDA is out of memory. However, it seems that pytorch is only accessing a tiny amount of my GPU's memory.
RuntimeError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 0; 4.00 GiB total capacity; 3.78 MiB already allocated; 0 bytes free; 4.00 MiB reserved in total by PyTorch)
It's not that PyTorch is only accessing a tiny amount of GPU memory; rather, your PyTorch program has been cumulatively allocating tensors on the GPU, and that 2 MiB tensor is simply the allocation that hits the limit. Try a lower batch size, or run the model with half precision to save GPU memory.
These commands should let PyTorch access the GPU's memory:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "1"
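
For the half-precision suggestion above, here is a minimal sketch using automatic mixed precision; the model, data and batch size are placeholders, not taken from the reinforcement learning code in the question.

import torch

model = torch.nn.Linear(512, 2).cuda()
optimizer = torch.optim.Adam(model.parameters())
scaler = torch.cuda.amp.GradScaler()

for step in range(100):
    x = torch.randn(8, 512, device="cuda")             # a smaller batch also reduces memory
    target = torch.randint(0, 2, (8,), device="cuda")
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():                    # forward pass runs largely in fp16
        loss = torch.nn.functional.cross_entropy(model(x), target)
    scaler.scale(loss).backward()                      # scale the loss to avoid fp16 underflow
    scaler.step(optimizer)
    scaler.update()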

Strange Cuda out of Memory behavior in Pytorch

Edit: SOLVED. The problem was related to the number of workers; lowering them solved it.
I am using a 24 GB Titan RTX for an image segmentation U-Net with PyTorch.
It keeps throwing CUDA out of memory at different batch sizes. I also have more free memory than it says it needs, and lowering the batch size INCREASES the memory it tries to allocate, which doesn't make any sense.
Here is what I tried:
Image size = 448, batch size = 8
"RuntimeError: CUDA error: out of memory"
Image size = 448, batch size = 6
"RuntimeError: CUDA out of memory. Tried to allocate 3.12 GiB (GPU 0; 24.00 GiB total capacity; 2.06 GiB already allocated; 19.66 GiB free; 2.31 GiB reserved in total by PyTorch)"
It says it tried to allocate 3.12 GiB while I have 19 GiB free, and it still throws an error??
Image size = 224, batch size = 8
"RuntimeError: CUDA out of memory. Tried to allocate 28.00 MiB (GPU 0; 24.00 GiB total capacity; 2.78 GiB already allocated; 19.15 GiB free; 2.82 GiB reserved in total by PyTorch)"
Image size = 224, batch size = 6
"RuntimeError: CUDA out of memory. Tried to allocate 344.00 MiB (GPU 0; 24.00 GiB total capacity; 2.30 GiB already allocated; 19.38 GiB free; 2.59 GiB reserved in total by PyTorch)"
Reduced the batch size, but it tried to allocate more???
Image size = 224, batch size = 4
"RuntimeError: CUDA out of memory. Tried to allocate 482.00 MiB (GPU 0; 24.00 GiB total capacity; 2.21 GiB already allocated; 19.48 GiB free; 2.50 GiB reserved in total by PyTorch)"
Image size = 224, batch size = 2
"RuntimeError: CUDA out of memory. Tried to allocate 1.12 GiB (GPU 0; 24.00 GiB total capacity; 1.44 GiB already allocated; 19.88 GiB free; 2.10 GiB reserved in total by PyTorch)"
Image size = 224, batch size = 1
"RuntimeError: CUDA out of memory. Tried to allocate 1.91 GiB (GPU 0; 24.00 GiB total capacity; 894.36 MiB already allocated; 20.94 GiB free; 1.03 GiB reserved in total by PyTorch)"
Even with stupidly low image sizes and batch sizes...
SOLVED: the problem was related to the number of workers; lowering them solved it.
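
For reference, a sketch of what "lowering the workers" means in PyTorch terms; the random dataset and the exact num_workers value are placeholders, and the claim that the workers were the cause comes from the post above.

import torch
from torch.utils.data import DataLoader, TensorDataset

# Each DataLoader worker prefetches its own batches, so a high num_workers adds
# memory pressure on top of the model itself; per the post above, lowering it
# resolved the OOM errors here. The dataset stands in for the real segmentation data.
dataset = TensorDataset(torch.randn(100, 3, 224, 224), torch.randn(100, 1, 224, 224))
loader = DataLoader(dataset, batch_size=6, num_workers=2, pin_memory=True)

for images, masks in loader:
    pass            # the U-Net training step would go here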
