How to clear GPU memory after using a model? - pytorch

I'm trying to free up GPU memory after finishing using the model.
I checked nvidia-smi before creating and training the model: 402MiB / 7973MiB
After creating and training the model, I checked the GPU memory status again with nvidia-smi: 7801MiB / 7973MiB
Now I tried to free up GPU memory with:
del model
torch.cuda.empty_cache()
gc.collect()
and checked again the GPU memory: 2361MiB / 7973MiB
As you can see, not all of the GPU memory was released (I expected to get back to ~400MiB / 7973MiB).
I can only release the GPU memory via the terminal (sudo fuser -v /dev/nvidia* and kill pid).
Is there a way to free up the GPU memory after I'm done using the model?

This happens because PyTorch reserves GPU memory in its caching allocator for fast reuse. To learn more about it, see PyTorch memory management. To work around this, you can use the following code:
from numba import cuda
cuda.select_device(your_gpu_id)
cuda.close()
However, this comes with a catch: it closes the CUDA context on that GPU completely, so you can't start training again without restarting the whole process.
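Before resorting to that, it helps to know where the residual ~2361MiB actually lives: part of it is cache still held by PyTorch's allocator, and part is the CUDA context itself, which nvidia-smi counts but which only disappears when the process exits. A minimal sketch for telling the two apart (the helper names are mine, not from the question; the imports sit inside the functions so the file loads even without a GPU stack):

```python
import gc

def report_cuda_memory(tag=""):
    """Print tensor-allocated vs. cache-reserved CUDA memory."""
    import torch
    if not torch.cuda.is_available():
        print("CUDA not available")
        return
    alloc = torch.cuda.memory_allocated() / 2**20     # bytes held by live tensors
    reserved = torch.cuda.memory_reserved() / 2**20   # bytes held by the caching allocator
    print(f"{tag} allocated={alloc:.0f}MiB reserved={reserved:.0f}MiB")

def release_model(model):
    """Drop references, collect, then return cached blocks to the driver."""
    import torch
    del model
    gc.collect()              # drop Python-side references first
    torch.cuda.empty_cache()  # then release the allocator's cached blocks
```

Note that `gc.collect()` should run before `empty_cache()` (the question's snippet does it the other way around), and even then nvidia-smi will keep showing the few hundred MiB of context overhead until the process ends.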

Related

Clearing memory when training Machine Learning models with Tensorflow 1.15 on GPU

I am training a fairly intensive ML model on a GPU. What often happens is that I start training, let it run for a couple of epochs, notice that my changes have not made a significant difference in the loss/accuracy, make edits, re-initialize the model, and restart training from epoch 0. In this case, I often get OOM errors.
My guess is that despite my overriding all the model variables, something is still taking up space in GPU memory.
Is there a way to clear the memory of the GPU in Tensorflow 1.15 so that I don't have to keep restarting the kernel each time I want to start training from scratch?
It depends on exactly which GPUs you're using. I'm assuming you're using NVIDIA, but even then, depending on the exact GPU, there are three ways to do this:
nvidia-smi -r works on Tesla and other modern variants.
nvidia-smi --gpu-reset works on a variety of older GPUs.
Rebooting is the only option for the rest, unfortunately.
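A device reset also kills every other process using the GPU, so it is worth trying in-process cleanup first. In TF 1.15 the usual between-runs reset is to clear the Keras session and the default graph; a small sketch (the function name is mine, and this frees graph state inside the process rather than forcing TF to hand its preallocated pool back to the driver):

```python
def reset_tf_training_state():
    """Tear down graph/session state between training runs (TF 1.x sketch)."""
    import tensorflow as tf  # imported here so the helper loads without TF installed
    tf.keras.backend.clear_session()  # drop the global Keras session and its graph
    tf.reset_default_graph()          # start the next run from an empty default graph
```

Calling this before rebuilding the model avoids stale variables from earlier runs piling up in the graph, which is a common source of the creeping OOM described above.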

Keras multi gpu memory usage is different

I'm trying to use 4 GPUs in my laptop, so I use the command
new_model = multi_gpu_model(model, gpus=4)
It works very well, but there is one problem:
only the first GPU uses noticeably more memory.
Below are my code and the state of my GPUs' memory, checked with watch nvidia-smi.
gpu0's memory increases only when the model is built, before the new_model = multi_gpu_model(model, gpus=4) command; at that point about 4000MiB is occupied on gpu0.
Why does this unfair GPU usage occur,
and how can I use the 4 GPUs equally?
Please give me a hint.
Thanks a lot.

How to free the consumption of memory from Memory Pool for Training on GPU?

Allocating memory with CuPy throws an out-of-memory error. Almost all of the 12 GB is consumed before training even starts, and during training all the memory is used up.
Everything works fine on the CPU.
I have tried reducing the batch size from 100 down to single digits (4).
I have moved the small tasks to NumPy, since CuPy consumes GPU memory.
I have tried using more than one GPU but encountered the same problem.
please_refer_this
Please note, it works on the CPU.
The results are fine, but I need to train more, and for that using CuPy and the GPU is necessary.
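CuPy keeps freed blocks in a memory pool rather than returning them to the driver, so between training phases it may help to flush the pools explicitly, or to switch the allocator to CUDA unified memory, which can oversubscribe device RAM at some speed cost. A sketch under those assumptions (function names are mine; the imports sit inside so the file loads without a GPU):

```python
def free_cupy_pools():
    """Return cached device and pinned-host blocks to the driver."""
    import cupy as cp
    cp.get_default_memory_pool().free_all_blocks()
    cp.get_default_pinned_memory_pool().free_all_blocks()

def use_managed_memory():
    """Back future allocations with CUDA unified (managed) memory."""
    import cupy as cp
    cp.cuda.set_allocator(cp.cuda.MemoryPool(cp.cuda.malloc_managed).malloc)
```

`free_all_blocks()` only releases blocks with no live references, so delete large arrays (or let them go out of scope) before calling it.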

Why is tensorflow-gpu still using the CPU?

I am using Keras with tensorflow-gpu as the backend; I don't have tensorflow (the CPU version) installed. All the outputs show the GPU selected, but TF is using the CPU and system memory.
When I run my code, the output is: output_code
I even ran device_lib.list_local_device() and the output is: list_local_devices_output
After running the code I tried nvidia-smi to see the usage of gpu and the output is:
nvidia-smi output
Tensorflow-gpu = "1.12.0"
CUDA toolkit = "9.0"
cuDNN = "7.4.1.5"
Environment Variables contain:
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\bin;
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\libnvvp;
C:\WINDOWS\system32;
C:\WINDOWS;
C:\WINDOWS\System32\Wbem;
C:\WINDOWS\System32\WindowsPowerShell\v1.0\;
C:\WINDOWS\System32\OpenSSH\;
C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;
D:\Anaconda3;D:\Anaconda3\Library\mingw-w64\bin
D:\Anaconda3\Library\usr\bin;
D:\Anaconda3\Library\bin;
D:\Anaconda3\Scripts;D:\ffmpeg\bin\;
But still, when I check memory usage in Task Manager, the output is:
CPU utilization 51%, RAM utilization 86%
GPU utilization 1%, GPU-RAM utilization 0%
Task_manager_Output
So I think it is still using the CPU instead of the GPU.
System Configuration:
Windows-10 64 bit; IDE: Liclipse; Python: 3.6.5
It is using the GPU, as you can see in logs.
The problem is that many operations cannot be run on the GPU, and as long as your data is small and your model complexity is low, you will end up with low GPU usage. Possible causes:
Your batch_size may be too low -> increase it until you run into OOM errors.
Your data loading is consuming a lot of time, so your GPU has to wait (I/O reads).
Your RAM is too low and the application falls back to disk.
Preprocessing is too slow. If you are dealing with images, try to compute everything in a generator or on the GPU if possible.
You are using some operations which are not GPU-accelerated.
Here is some more detailed explanation.
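Before tuning any of the points above, it can help to confirm where each op is actually placed. TF 1.x can log op-to-device assignments at session creation; a small sketch (the function name is mine):

```python
def run_with_placement_logging():
    """Run a trivial op while TF prints each op's assigned device (TF 1.x)."""
    import tensorflow as tf  # imported here so the sketch loads without TF installed
    config = tf.ConfigProto(log_device_placement=True)  # log op -> device mapping
    with tf.Session(config=config) as sess:
        a = tf.constant([1.0, 2.0], name='a')
        b = tf.constant([3.0, 4.0], name='b')
        print(sess.run(a + b))  # placement lines mention gpu:0 if ops ran there
```

If the placement log shows your heavy ops on /device:GPU:0 but utilization stays near 1%, the bottleneck is almost certainly input/preprocessing rather than device selection.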

Memory Estimation for Convolution Neural Network in Tensorflow

Hello Everyone,
I am working on an image classification problem using TensorFlow and a convolutional neural network.
My model has the following layers:
Input image of size 2456x2058
3 convolution Layer {Con1-shape(10,10,1,32); Con2-shape(5,5,32,64); Con3-shape(5,5,64,64)}
3 max pool 2x2 layer
1 fully connected layer.
I have tried using the nvidia-smi tool, but it only shows me the GPU memory consumption while the model runs.
I would like to know if there is any method or a way to estimate the memory before running the model on the GPU, so that I can design models with the available memory in mind.
I have tried using this method for estimation, but my calculated memory and the observed memory utilisation are nowhere near each other.
Thank you all for your time.
As far as I understand, when you open a session with tensorflow-gpu, it allocates all the memory on the GPUs that are available. So when you look at the nvidia-smi output, you will always see the same amount of used memory, even if the model actually uses only a part of it. There are options when opening a session to force tensorflow to allocate only a part of the available memory (see How to prevent tensorflow from allocating the totality of a GPU memory? for instance).
You can control the memory allocation of the GPU in TensorFlow. Once you have calculated the memory requirements for your deep learning model, you can use tf.GPUOptions.
For example, if you want to allocate roughly 4 GB of GPU memory out of 8 GB:
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.4
session = tf.Session(config=config, ...)
Then pass the config to tf.Session via the config parameter.
The per_process_gpu_memory_fraction option bounds the fraction of total GPU memory the process may allocate.
Here's the link to documentation :-
https://www.tensorflow.org/tutorials/using_gpu
NVIDIA-SMI ... shows me the GPU memory consumption as the model run
TF preallocates all available memory when you use it, so NVIDIA-SMI would show nearly 100% memory usage ...
but my calculated memory and observed memory utilisation are no where near to each other.
... so this is unsurprising.
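As a rough cross-check of any hand estimate, the forward-pass activations for the layers listed in the question can be tallied directly. This sketch assumes SAME padding (convs preserve spatial size), a 2x2 max pool after each conv, float32 activations, and a grayscale input; it ignores cuDNN workspace and gradient buffers, which can multiply the total by 2-3x during training (all assumptions are mine, not from the post):

```python
def estimate_activation_mib(h=2456, w=2058, batch=1, dtype_bytes=4):
    """Very rough forward-pass activation tally for the question's network."""
    total = h * w * 1 * dtype_bytes               # grayscale input image
    for channels in (32, 64, 64):                 # Con1, Con2, Con3 output depths
        total += h * w * channels * dtype_bytes   # conv output feature maps
        h, w = h // 2, w // 2                     # 2x2 max pool halves each side
        total += h * w * channels * dtype_bytes   # pooled feature maps
    return batch * total / 2**20                  # MiB
```

Even at batch size 1 this comes to well over a gigabyte of activations for a 2456x2058 input, which helps explain why per-layer parameter counts alone land nowhere near the observed usage once TF's preallocation is layered on top.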
