PyTorch DirectML WSL no operator read_file

I'm trying to use pytorch-directml on WSL to read some images and build a CNN.
However, I get an error when I call the torchvision.io.read_image function to read an image.
The versions I'm on are:
Python: 3.8
Pytorch-directml: 1.8.0a0.dev220506
Torchvision: 0.9.0
Can anyone help with this? Would be much appreciated!

Related

Google Colab + Pytorch: RuntimeError: No CUDA GPUs are available

Screenshot of error:
Hello, I am trying to run this Pytorch application, which is a CNN for classifying dog and cat pics.
I am using Google Colab for the GPU, but for some reason, I get RuntimeError: No CUDA GPUs are available. This is weird because I enabled the GPU in the Colab settings and then checked that it was available with torch.cuda.is_available(), which returned True.
The weirdest thing is that this error doesn't appear until about 1.5 minutes after I run the code. You would think that if it couldn't detect the GPU, it would notify me sooner.
I've had no problems using the Colab GPU when running other Pytorch applications using the exact same notebook. I can only imagine it's a problem with this specific code, but the returned error is so bizarre that I had to ask on StackOverflow to make sure.
Try again, this is usually a transient issue when there are no Cuda GPUs available
Recently I had a similar problem: in Colab, print(torch.cuda.is_available()) returned True, but in one specific project print(torch.cuda.is_available()) returned False. Both of our projects contain a line similar to os.environ["CUDA_VISIBLE_DEVICES"]. After I commented out that line, the GPU could be used.
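The effect described in this answer can be checked before any framework is imported. The following is a minimal sketch, assuming the project sets CUDA_VISIBLE_DEVICES somewhere in its code; the helper name visible_gpu_setting is hypothetical and not part of PyTorch or Colab.

```python
import os


def visible_gpu_setting(env=None):
    """Summarise how CUDA_VISIBLE_DEVICES restricts GPU visibility.

    `env` defaults to os.environ; a plain dict can be passed for testing.
    """
    env = os.environ if env is None else env
    value = env.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        # The variable is not set, so the CUDA runtime exposes every GPU.
        return "unset: all GPUs are visible"
    if value.strip() in ("", "-1"):
        # An empty value or -1 leaves no device visible to the process.
        return "hides all GPUs"
    # Any other value is a list of device indices or UUIDs.
    return "limits visibility to device(s): " + value


if __name__ == "__main__":
    print(visible_gpu_setting())
```

Note that an index that does not exist on the machine (for example 1 on a single-GPU Colab runtime) also leaves no usable device, which would be consistent with the error appearing only in the project that sets the variable.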
-------My English is poor, I use Google Translate

opencv doesn't use all GPU memory

I'm trying to use the cvlib package, which uses a YOLOv3 model to recognize objects in images, on Windows 10.
Let's take a simple example:
import cvlib as cv
import time
from cvlib.object_detection import draw_bbox
inittimer=time.time()
bbox, label, conf = cv.detect_common_objects(img,confidence=0.5,model='yolov3-worker',enable_gpu=True)
print('The process took %.3f s' % (time.time() - inittimer))
output_image = draw_bbox(img, bbox, label, conf)
The result is ~60 ms.
cvlib uses OpenCV to compute the CNN part.
If I now check how much GPU memory TensorFlow used, using subprocess, it took only 824 MiB.
While the program runs, if I start nvidia-smi it gives me this result:
As you can see, there is much more memory available. My question is simple: why doesn't cvlib (and so TensorFlow) use all of it to improve the detection time?
EDIT:
As far as I understand, cvlib uses TensorFlow, but it also uses the OpenCV detector. I installed OpenCV using CMake and CUDA 10.2.
I don't understand why, but nvidia-smi reports CUDA Version: 11.0, which is not the version I installed. Maybe that's part of the problem?
You can verify if opencv is using CUDA or not. This can be done using the following
import cv2
print(cv2.cuda.getCudaEnabledDeviceCount())
This should give you the number of CUDA-enabled devices in your machine. You should also check the build information by using the following:
import cv2
print(cv2.getBuildInformation())
The output of both checks indicates whether your OpenCV build can access the GPU. If it cannot, you may consider reinstalling it.
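The two checks above can be combined into one short script. This is only a sketch: it assumes the build information contains a line beginning with "NVIDIA CUDA:", which is how CUDA-enabled builds usually report it, and the helper name cuda_line_from_build_info is hypothetical.

```python
import re


def cuda_line_from_build_info(build_info):
    """Return the text after 'NVIDIA CUDA:' in OpenCV build info, or None.

    Assumes the report contains a line such as
    'NVIDIA CUDA:   YES (ver 10.2, CUFFT CUBLAS)'.
    """
    match = re.search(r"^[ \t]*NVIDIA CUDA:[ \t]*(.+)$", build_info, flags=re.MULTILINE)
    return match.group(1).strip() if match else None


if __name__ == "__main__":
    try:
        import cv2
    except ImportError:
        print("OpenCV is not installed in this environment")
    else:
        # None here means the build report has no CUDA line at all.
        print("CUDA build line:", cuda_line_from_build_info(cv2.getBuildInformation()))
        print("CUDA devices:", cv2.cuda.getCudaEnabledDeviceCount())
```

A device count of 0 together with a missing or "NO" CUDA line suggests the installed OpenCV was built without CUDA support.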
I got it! The problem came from the fact that I created a new Net object for each iteration.
Here is the related issue on GitHub where you can follow it: https://github.com/opencv/opencv/issues/16348
With a custom function, it now runs at ~60 fps. Be aware that cvlib may not be designed for real-time computation.
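The custom function itself is not shown in the answer. One way to avoid rebuilding the network on every iteration is to load it once and reuse it, sketched below. The names CachedNet and load_yolo_with_cuda are hypothetical, and the loader assumes an OpenCV build with the CUDA DNN backend and Darknet config/weights files on disk.

```python
class CachedNet:
    """Build a network on first use and hand back the same object afterwards."""

    def __init__(self, loader):
        self._loader = loader  # zero-argument callable that builds the network
        self._net = None
        self.load_count = 0    # how many times the loader actually ran

    def get(self):
        if self._net is None:
            self._net = self._loader()
            self.load_count += 1
        return self._net


def load_yolo_with_cuda(config_path, weights_path):
    """Assumed loader: read a Darknet model and ask OpenCV to run it on CUDA."""
    import cv2  # imported lazily so the cache itself has no OpenCV dependency

    net = cv2.dnn.readNetFromDarknet(config_path, weights_path)
    net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
    net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)
    return net
```

In a detection loop, something like cache = CachedNet(lambda: load_yolo_with_cuda(cfg, weights)) would be created once (cfg and weights being placeholder paths), and cache.get() called per frame, so the model is read from disk and moved to the GPU only on the first call.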
workon opencv_cuda
cd opencv
mkdir build
cd build
cmake -D CMAKE_BUILD_TYPE=RELEASE
and share the result.
It should be something like this

ImageDataGenerator.flow_from_directory() segfaulting with no augmentation

I'm trying to construct an autoencoder for ultrasound images, and am unable to use ImageDataGenerator.flow_from_directory() to provide train/test datasets due to segfault on call to the method. No augmentation is being used, which should only result in the original images being provided by the generator.
The source images are in TIFF format, so I first tried converting them to JPG and PNG, thinking that PIL might be faulting on the encoding, but that made no difference. I have tried converting to different color modes (grayscale, RGB, RGBA) with no change in behavior. I have stripped the code down to the bare minimum, taking defaults for nearly all function parameters, and I still get a segfault on the call in both debug and full runs.
# Directory below contains a single subdirectory "input" containing 5635 TIFF images
from keras.preprocessing.image import ImageDataGenerator

print('Create train_gen')
train_gen = ImageDataGenerator().flow_from_directory(
    directory=r'/data/ultrasound-nerve-segmentation/train/',
    class_mode='input'
)
print('Created train_gen')
Expected output is a report of 5635 images found in one class "input" and both debug messages to print out, with usable generator for use in Model.fit_generator().
Actual output:
Using TensorFlow backend.
Create train_gen
Found 5635 images belonging to 1 classes.
Segmentation fault
Is there something I'm doing above that could be causing the problem? According to every scrap of sample code I can find, it looks like it should be working.
Environment is:
Ubuntu 16.04 LTS
CUDA 10.1
tensorflow-gpu 1.14
Keras 2.2.4
Python 3.7.2
Thanks for any help you can provide!
OK so I haven't pegged specifically why it is segfaulting, but it appears to be related to the virtualenv it is running under. I was apparently using a JupyterHub environment, which seems to not behave even when run from an ssh session (vs from within JupyterHub consoles). Once I created a whole new standalone virtualenv with only the TF + Keras packages installed, it appears to run just fine.
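Since the fix came from switching environments, it can help to record which interpreter and package versions each environment actually uses and compare the two. The snippet below is a rough sketch using only the standard library (Python 3.8+); the function name describe_environment and the default package list are assumptions for illustration.

```python
import sys
from importlib import metadata


def describe_environment(packages=("tensorflow-gpu", "Keras", "Pillow", "numpy")):
    """Collect the interpreter location and installed versions of some packages."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        # True when running inside a venv or a recent virtualenv
        "in_virtualenv": sys.prefix != sys.base_prefix,
        "packages": versions,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(key, "=", value)
```

Running it once in the JupyterHub environment and once in the fresh virtualenv shows whether the two resolve to different interpreters or different package versions.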

Using PyTorch on AWS Lambda

Has anyone had any luck using PyTorch on AWS Lambda for feature extraction from images, or just using the framework at all? I finally got PyTorch, numpy, and pillow zipped in a folder under the uncompressed size limit (which is actually around 262 MB), but I had to build PyTorch from source to do this. The problem I am having now is that Lambda runs a very old version of gcc (4.8.3), which is very buggy and is missing whole header files. I believe the PyTorch docs state you should be using gcc 7 or later, but I'm hoping someone may have found a way around this. I built the source using gcc 7.5, but when I tried to import torch, Lambda used its installed version, 4.8.3, causing an error on import: Floating point exception (core dumped), which stems from the old version of gcc. Is there a possible solution? I've been at this for a day and a half now, so any help would be great. Better yet, does anyone have a PyTorch Lambda layer I could use?
I was able to utilize the below layers for using pytorch on AWS Lambda:
arn:aws:lambda:AWS_REGION:934676248949:layer:pytorchv1-py36:1 PyTorch 1.0.1
arn:aws:lambda:AWS_REGION:934676248949:layer:pytorchv1-py36:2 PyTorch 1.1.0
Found these on Fastai production deployment page, thanks to Matt McClean
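The AWS_REGION part of the ARNs above is a placeholder for the region the function runs in. As a rough sketch, the ARN can be filled in and attached with boto3 as follows; the function names pytorch_layer_arn and attach_layer are hypothetical, and the call assumes boto3 is installed and AWS credentials are configured.

```python
LAYER_TEMPLATE = "arn:aws:lambda:AWS_REGION:934676248949:layer:pytorchv1-py36:{version}"


def pytorch_layer_arn(region, version):
    """Fill the region and layer version into the ARN quoted in the answer."""
    return LAYER_TEMPLATE.format(version=version).replace("AWS_REGION", region)


def attach_layer(function_name, region, version):
    """Attach the layer to an existing Lambda function (needs boto3 and credentials)."""
    import boto3  # imported lazily so the ARN helper works without boto3

    client = boto3.client("lambda", region_name=region)
    # Layers replaces the function's whole layer list, so include any others you need.
    return client.update_function_configuration(
        FunctionName=function_name,
        Layers=[pytorch_layer_arn(region, version)],
    )
```

For example, pytorch_layer_arn("us-east-1", 2) gives the ARN for the PyTorch 1.1.0 layer in us-east-1.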

How to debug A graph in tensorflow in windows

I am trying to debug a program that uses TensorFlow. The problem is that I can't launch the dbdp in PyDev. How can I check the contents of my tensors or my layers? If anyone knows how to debug a TensorFlow program correctly on Windows, please help me.
Thanks,
