I have a CNN with 2 hidden layers. When i use keras on cpu with 8GB RAM, sometimes i have "Memory Error" or sometimes precision class was 0 but some classes at the same time were 1.00. If i use keras on GPU,will it solve my problem?
You probably don't have enough memory to fit all the images in the CPU during training. Using a GPU will only help if it has more memory. If this is happening because you have too many images or they're resolution is too high, you can try using keras' ImageDataGenerator and any of the flow methods to feed your data in batches.
Related
Is there a way that I can manage how much memory to reserve for Pytorch? I've trying to finetune a BERT model from Huggingface and I'm already using a A100 Nvidia GPU with 80GiB of GPU from Paperspace and still managed to go "CUDA out of memory". I'm not training from scratch or anything and my dataset is composed by less than 250,000 datum. I don't know if I'm missing something or if there is a way to reduce the allocated memory for Pytorch.
(Seriously, I just wanna cry now.)
Error specs
I've tried reducing my batch size at training dataset and eval dataset, also turning fp16=Tue. I even tried with the gradient accumulation but nothing seems to work. I've been cleaning with gc.collect() and torch.cuda.empty_cache() and restarting my kernel machine. I even tried with a smaller model.
I am training a GAN and I see that my performance is very different on my CPU and GPU. I noticed that on the GPU installation, it is 2.0.8 and for CPU it is 2.1.5. On a separate machine with keras+tf GPU I get the same performance as the CPU one from before, the keras version is 2.1.6.
Is this expected? In the keras release notes I did not find anything that would change the way my training works.
The performance with the new one is better in many senses. Much faster convergence (10x less epochs required) but the images are less smooth sometimes.
Since we have object detection models based on CNN such as Fast RCNN, Faster RCNN, YOLO (You only look once), ssd (single shot detector).
I have tried running Faster RCNN using CAFFE but backward path is not implemented for CPU mode. Is there any CNN based model which I can use it to train using CPU.
Any help will be appreciated.
faster-rcnn layers in CPU: https://github.com/neuleaf/faster-rcnn-cpu
SSD's original implementation already supports CPU training.
I would like to know for a given training setup (model, optimizer, batch size) how much memory my network needs. Can Keras automatically give me this number?
Also interesting: How much memory is needed during inference?
I am training a CNN with 1M images with theano. Now I am puzzled on how to prepare the training data.
My questions are:
When the images resize to 64*64*3, the size of whole data is about 100G. Should I save the data into a single npy file or some smaller files? which one is efficient?
How to decide the number of parameters of the CNN? How about 1M/10 = 100K?
Should I limit the memory cost of a training block and the CNN parameters less than GPU memory?
My computer is with 16G memory and GPU Titian.
Thank you very much.
If you're using a NN framework like pylearn2, lasagne, Keras, etc, check the docs to see if there are guidelines for iterating batches off disk from an hdf5 store or similar.
If there's nothing and you don't want to roll your own, the fuel package provides lots of helpful data iteration schemes that can be adapted to models in theano (and probably most of the frameworks; there's a good tutorial in the fuel repository).
As for the parameters, you'll have to cross validate to figure out the best parameters for your data.
And yes, the model size + minibatch size + dropout mask for the batch has to be under the available vram.