Segfault in pytorch on M1: torch.from_numpy(X).float() - pytorch

I'm using an M1.
I'm trying to use pytorch for a conv net.
I have a numpy array that I'm trying to turn into a torch tensor.
When I call
torch.from_numpy(X)
pytorch throws an error that it got a double when it expected a float.
When I call
torch.from_numpy(X).float() on a friend's computer, everything is fine.
But when I call this command on my computer, I get a segfault.
Has anyone seen this, or does anyone know what might be happening and how to fix it?
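For context, a minimal sketch of the conversion being described (X here is just a placeholder float64 array):

import numpy as np
import torch

X = np.random.rand(4, 3)           # numpy defaults to float64 ("double")
t64 = torch.from_numpy(X)          # zero-copy view, dtype stays torch.float64
t32 = torch.from_numpy(X).float()  # casts to float32, which conv layers expect by default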

What's your pytorch version? I've encountered the same problem on my Macbook Pro M1, and my pytorch version was 1.12.0 at first. Then I downgraded it to version 1.10.0 and the problem was solved. I suspect this has something to do with compatibility with the M1 in newer torch versions.
Actually, I first uninstalled torch using pip3 uninstall torch and then reinstalled it with pip3 install torch==1.10.0
But if you are using torchvision or other affiliated packages, you may need to downgrade those as well.
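For reference, a sketch of that reinstall with a matching torchvision pin added (torchvision 0.11.1 is the pairing I'd expect for torch 1.10.0, but verify against the official compatibility table for your setup):

pip3 uninstall -y torch torchvision
pip3 install torch==1.10.0 torchvision==0.11.1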

Related

pytorch unable to run inference with GPU

I'm developing a project based on yolov7, but I started facing this error where torch recognizes my GPU but torchvision throws a NotImplementedError.
This is the error
NotImplementedError: Could not run 'torchvision::nms' with arguments from the 'CUDA' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'torchvision::nms' is only available for these backends: [CPU, QuantizedCPU, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradMPS, AutogradXPU, AutogradHPU, AutogradLazy, Tracer, AutocastCPU, AutocastCUDA, FuncTorchBatched, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PythonDispatcher].
I tried installing torchvision with CUDA built in, but that gave me the same error. I also tried reinstalling pytorch, and that didn't work either.
The version of torchvision installed in my env was not equipped with CUDA, as it came from a plain pip install torchvision. For torchvision to work with an Nvidia GPU it has to be a CUDA-enabled build, so install the stack with the following command: conda install pytorch torchvision torchaudio pytorch-cuda={CUDA version} -c pytorch -c nvidia
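A quick way to check whether the installed builds actually have CUDA support (these are standard torch/torchvision introspection attributes; the "+cpu"/"+cu" suffix convention applies to wheels from the PyTorch index):

import torch
import torchvision

print(torch.__version__, torchvision.__version__)  # CPU-only wheels typically carry a "+cpu" suffix
print(torch.version.cuda)                          # None on a CPU-only torch build
print(torch.cuda.is_available())                   # should be True before running yolov7 on the GPU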

Cannot import torch Error loading ..\caffe2_nvrtc.dll" or one of its dependencies

I am on a Windows 10 64 bit system.
Pytorch for cuda has been working successfully for some time.
Today I tried to upgrade to the latest version of Pytorch (1.13) using
conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
Now I cannot import torch. I get the error:
OSError: [WinError 126] The specified module could not be found. Error loading "C:\Users\alan\anaconda3\lib\site-packages\torch\lib\caffe2_nvrtc.dll" or one of its dependencies.
I have tried both in a Jupyter notebook and in the Spyder IDE.
I have tried completely removing Anaconda and reinstalling afresh and then reinstalling Pytorch with no success.
I do not believe I have any other versions of python installed.
The offending dll (caffe2_nvrtc.dll) does seem to be in the file location specified.
I have found various similar problems reported but they all date back to 2020 or earlier and none of them seemed to have a satisfactory solution.
Can anyone point me in the correct direction?
I still do not understand why using conda did not work, but I tried again using pip and that did work.
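In case it helps, the pip route I would expect for this setup (this mirrors the install selector for PyTorch 1.13 with CUDA 11.7, but verify against the official instructions for your versions):

pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu117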
I experienced the same problem as yours today.
It turns out that when I use the Anaconda Prompt, the problem disappears.
Then I speculated that the only difference between the two scenarios is that the Anaconda Prompt uses the base anaconda environment, and that somehow the conda environment is not activated when Spyder is launched on its own.
So the solution is to open Spyder from the Anaconda Prompt.
Then it works.
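In other words, something like this from the Anaconda Prompt (assuming Spyder and PyTorch live in the base environment; swap in your own environment name if not):

conda activate base
spyder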

RuntimeError: cuDNN version incompatibility

I wrote an LSTM NLP classifier with PyTorch, in google colab and it worked well. Now, I run it on google colab pro, but I get this error:
RuntimeError: cuDNN version incompatibility: PyTorch was compiled against (8, 3, 2) but found runtime version (8, 0, 5). PyTorch already comes bundled with cuDNN. One option to resolving this error is to ensure PyTorch can find the bundled cuDNN. One possibility is that there is a conflicting cuDNN in LD_LIBRARY_PATH.
I have no idea how to fix this. I'm using GPU on colab pro.
I've tried this link and it didn't work.
How I declared device:
DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
Fixed by upgrading cuDNN to 8.4.
Reference: https://github.com/JaidedAI/EasyOCR/issues/716
If you are using Google Colab, use this command:
!pip install --upgrade torch torchvision
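To confirm which cuDNN PyTorch actually sees after the upgrade, a small check using the standard torch.backends.cudnn calls:

import torch

print(torch.backends.cudnn.is_available())  # True if a usable cuDNN was found
print(torch.backends.cudnn.version())       # e.g. 8400 for cuDNN 8.4.0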

How do I fix a keras error for a plaidbench keras test?

I am trying to install plaidml-keras so I can use non-Nvidia GPUs with Keras in python/jupyter. After clearing several other hurdles I get as far as:
plaidbench keras mobilenet
but it errors twice
ImportError: cannot import name 'object_list_uid' from 'keras.utils.generic_utils' (/Users/me/sprinthive/src/notebooks/nbenv/lib/python3.7/site-packages/keras/utils/generic_utils.py)
File "/Users/me/sprinthive/src/notebooks/nbenv/lib/python3.7/site-packages/plaidbench/frontend_keras.py", line 321, in __init__
raise core.ExtrasNeeded(['plaidml-keras'])
plaidbench.core.ExtrasNeeded: Missing needed packages for benchmark; to fix, pip install plaidml-keras
This is in spite of already having plaidml-keras installed:
pip freeze | grep plaid
plaidbench==0.6.4
plaidml==0.6.4
plaidml-keras==0.6.4
[I am using 0.6.4 to make it work on macOS 10.13 High Sierra]
How can I resolve the above errors?
Thanks!
I worked this out by creating a virtual environment with Anaconda. Beware that I am working on Windows, so this might not be a solution for your problem. If I had to guess, something I installed earlier was causing a Python package conflict; I think it is related to the tensorflow library, but I haven't dug into that. I would recommend trying a fresh virtual environment on your Mac and installing the plaidml packages there. The error message before was exactly the same.
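A sketch of that fresh-environment approach on macOS (the keras==2.2.4 pin is my assumption about a release old enough to still expose object_list_uid in keras.utils.generic_utils; adjust if plaidml-keras pulls in a different version):

python3 -m venv plaidml-env
source plaidml-env/bin/activate
pip install plaidml-keras==0.6.4 plaidbench==0.6.4 keras==2.2.4  # keras pin is an assumption
plaidml-setup                 # pick the non-Nvidia GPU when prompted
plaidbench keras mobilenet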

How do I use a previous version of Keras (0.3.1) on Colaboratory?

I tried pip installing 0.3.1, but when I print the version it outputs 2.1.4.
!pip install keras==0.3.1
import keras
print(keras.__version__)
I am trying to train deepmask (https://github.com/abbypa/NNProject_DeepMask/) for which I specifically need 0.3.1.
Note that if you've already imported keras, the import statement after the install has no effect; the previously loaded version stays in memory.
So first !pip install keras==0.3.1, then restart your kernel (ctrl-m . or Runtime -> Restart runtime) and then things should work as expected.
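Putting it together, a sketch of the sequence (the os.kill call is just a common way to force the Colab runtime to restart from code; using the Runtime menu works equally well):

!pip install keras==0.3.1
import os
os.kill(os.getpid(), 9)   # restarts the runtime so the already-loaded keras is dropped

# then, in a new cell after the restart:
import keras
print(keras.__version__)  # should now report 0.3.1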
