Google Colab + Pytorch: RuntimeError: No CUDA GPUs are available - python-3.x

Screenshot of error:
Hello, I am trying to run this Pytorch application, which is a CNN for classifying dog and cat pics.
I am using Google Colab for the GPU, but for some reason, I get RuntimeError: No CUDA GPUs are available. This is weird because I specifically both enabled the GPU in Colab settings, then tested if it was available with torch.cuda.is_available(), which returned true.
The weirdest thing is that this error doesn't appear until about 1.5 minutes after I run the code. You would think that if it couldn't detect the GPU, it would notify me sooner.
I've had no problems using the Colab GPU when running other Pytorch applications using the exact same notebook. I can only imagine it's a problem with this specific code, but the returned error is so bizarre that I had to ask on StackOverflow to make sure.

Try again, this is usually a transient issue when there are no Cuda GPUs available

Recently I had a similar problem, where Cobal print(torch.cuda.is_available()) was True, but print(torch.cuda.is_available()) was False on a specific project. Both of our projects have this code similar to os.environ["CUDA_VISIBLE_DEVICES"]. I can use this code comment and find that the GPU can be used.
-------My English is poor, I use Google Translate

Related

PyTorch is running natively on M1 MacBook, but something isn't working properly

I've just got an M1 MacBook Pro and have been working on getting my development environment set up. I have followed instructions to install PyTorch with mini-forge so that it will run natively.
To test this, I've been running the MNIST example code from here. The code runs fine and Activity Monitor shows me that it is running natively. However, something still seems to be wrong: when the exact same code is executed on my other machine accuracy of 99% is achieved usually after one epoch, and test loss is very low. On the M1 Macbook, although training loss goes down, test loss never drops below 2 and accuracy never exceeds 10% after 14 epochs.
Similar issues occurred when running other code, e.g. this produces a loss of nan at every step.
I haven't been able to find this issue mentioned anywhere else: PyTorch is running natively on the M1, but it just doesn't seem to be working correctly.
I'd appreciate any suggestions as to what might be causing this or how I might fix it.
Thanks!

opencv doesn't use all GPU memory

I'm trying to use the cvlib package which use yolov3 model to recognize objects on images on windows 10.
Let's take an easy example:
import cvlib as cv
import time
from cvlib.object_detection import draw_bbox
inittimer=time.time()
bbox, label, conf = cv.detect_common_objects(img,confidence=0.5,model='yolov3-worker',enable_gpu=True)
print('The process tooks %.3f s'%(time.time()-inittimer)
output_image = draw_bbox(img, bbox, label, conf)
The results give ~60ms.
cvlib use opencv to compute this cnn part.
If now I try to see how much GPU tensorflow used, using subprocess, It tooks only 824MiB.
while the program runs, if I start nvidia-smi it gives me this result:
As u can see there is much more memory available here. My question is simple.. why Cvlib (and so tensorflow) doesn't use all of it to improve the time's detection?
EDIT:
As far as I understand, cvlib use tensorflow but it also use opencv detector. I installed opencv using cmake and Cuda 10.2
I don't understand why but in the nvidia-smi it's written CUDA Version : 11.0 which is not. Maybe that's the part of the problem?
You can verify if opencv is using CUDA or not. This can be done using the following
import cv2
print(cv2.cuda.getCudaEnabledDeviceCount())
This should get you the number of CUDA enabled devices in your machine. You should also check the build information by using the following
import cv2
print cv2.getBuildInformation()
The output for both the above cases can indicate whether your opencv can access GPU or not. In case it doesn't access GPU then you may consider reinstallation.
I got it! The problem come from the fact that I created a new Net object for each itteration.
Here is the related issue on github where you can follow it: https://github.com/opencv/opencv/issues/16348
With a custom function, it now works at ~60 fps. Be aware that cvlib is, maybe, not done for real time computation.
workon opencv_cuda
cd opencv
mkdir build
cd build
cmake -D CMAKE_BUILD_TYPE=RELEASE
and share the result.
It should be something like this

Object detection slow and does not use GPU

I need to use Tensorflow Object Detection API to make some classification connected with recognition.
My problem is that using the API for detection with a pretrained coco model takes too much time and for sure does not use the GPU. I checked my tensorflow-gpu installation on different scripts and it works fine, but when I use this model for detection I can only see increse in CPU usage.
I checked different version of tensorflow (1.12, 1.14), different combinations of CUDA Toolkit (9.0, 10.0) and CuDNN (7.4.2, 7.5.1, 7.6.1) but it is all the same, also tried it on both Windows 7 and Ubuntu 16.04, no difference. My project however requires much faster detection time.
System information:
System: Windows 7, Ubuntu 16.04
Tensorflow: 1.12, 1.14
GPU: GTX 970
Run following python code, if it detects GPU then you can use GPU for training otherwise there is some problem,
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
One more thing, just because your CPU is utilizing does not mean GPU is not at work. CPU always be busy, GPU should also spike when you are training.
Paste the output of above code in the comment if you are not sure about the output.
Edit: After chat with OP on comments, I see the suggested code, and it is using pretrained model, so no training happening here. You are using model and not training a new model. So no gpu is being used.

Image segmentation with edgeTPU

I´m new here, so please be kind and teach me if I did not provide all the information you need :)
I would like to compare Edge TPU with other edge device such as Myriad. I would like to select one object detection model and one image segmentation model. Considering the following link which shows supported operations, I have noticed that yolov3 cannot be compiled for EdgeTPU because it includes LeakyRelu.
https://coral.withgoogle.com/docs/edgetpu/models-intro/
For image segmentation, I'd like to use Deeplab. But I'm still don't know if operations included in deeplab v3+, such as atrous convolution or feature pyramid network, are supported.
I'd appreciate if someone teach me what models are usable on edgeTPU. Are there any models of image segmentation?
Did you already found below?
https://github.com/tensorflow/models/blob/master/research/deeplab/g3doc/quantize.md
"mobilenetv2_coco_voc_trainaug_8bit":
deeplabv3_mnv2_pascal_train_aug_8bit/frozen_inference_graph.pb
This model is possible to converting to TFLite FlatBuffer.
And also possible to compile for edgetpu with edgetpu_compiler.
Note. edgetpu_api environment had updated.
You can find it below.
https://coral.withgoogle.com/news/updates-07-2019/
Yes. There are prepackaged segmentation models and code examples how to use them.
Here they are https://coral.ai/models/
Please share if you know where to find something similar for Movidius based VPU devices.
Here you can find all supported layers for edgetpu: https://coral.ai/docs/edgetpu/models-intro/#supported-operations.
And for Conv2D it says "Must use the same dilation in x and y dimensions.". So implementing a version of deeplab v3+ is possible for the edgetpu.

How to debug A graph in tensorflow in windows

I am trying to debug a program that use tensorflow, the problem is I can't launch the dbdp in pydev, How can to check the contain of my tensors or my layers? If any person know how to debug correctly a tensorflow program in windows, please I help me.
Thanks,

Resources