PyTorch "hipErrorNoBinaryForGpu: Unable to find code object for all current devices!" - pytorch

SYSTEM: Ryzen 5800x, rx 6700xt, 32 gigs of RAM, Ubuntu 22.04.1
I'm attempting to install Stable-Diffusion by following https://youtu.be/d_CgaHyA_n4
When attempting to run the SD script, I get the "hipErrorNoBinaryForGpu: Unable to find code object for all current devices!" error.
I believe this is caused by PyTorch not working as expected. When validating PyTorch's installation with "The Master Test", I get the same error:
"hipErrorNoBinaryForGpu: Unable to find code object for all current devices!"
Aborted (core dumped)
I believe that it is installed correctly, as the conda list command tells me that torch 1.12.0a0+git2a932eb and torchvision 0.13.0a0+f5afae5 are installed. Interestingly, when I change the command ever so slightly to torch.cuda.is_available (without the parentheses), I get the following output: <function is_available at 0x7f42278788b0>. Granted, I'm not sure what this is telling me. Following the "Verification" step resulted in the expected array of random numbers. However, the GPU driver check failed.
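For what it's worth, the <function is_available at 0x...> output is what Python prints for a function object that has been referenced but not called, so without the parentheses nothing on the GPU is actually queried. A minimal sketch of the difference, using a hypothetical stand-in function rather than torch itself so that it runs without a GPU:

```python
def is_available():
    """Hypothetical stand-in for torch.cuda.is_available (an assumption, not the real function)."""
    return False


reference = is_available   # no parentheses: only the function object, nothing is executed
result = is_available()    # parentheses: the function is actually called

print(reference)  # prints something like <function is_available at 0x...>
print(result)
```

With the real torch function, it is the call with parentheses that touches the ROCm runtime, which would explain why only that form aborts.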
Thank you in advance.

Try running the following command:
export HSA_OVERRIDE_GFX_VERSION=10.3.0
This made it work on my machine with an RX 6600 XT, which gave the same error before I exported the variable.
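If exporting the variable in the shell is inconvenient, the same override can be passed from Python when launching the script. This is only a sketch under assumptions: build_rocm_env is a hypothetical helper name, 10.3.0 is the value quoted above, and the script name in the usage comment is a placeholder.

```python
import os


def build_rocm_env(gfx_version="10.3.0", base_env=None):
    """Return a copy of the environment with HSA_OVERRIDE_GFX_VERSION set.

    build_rocm_env is a hypothetical helper; the caller's mapping is not modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["HSA_OVERRIDE_GFX_VERSION"] = gfx_version
    return env


env = build_rocm_env()
print(env["HSA_OVERRIDE_GFX_VERSION"])

# Usage (placeholder script name, not from the original post):
# import subprocess, sys
# subprocess.run([sys.executable, "your_script.py"], env=env, check=True)
```

The sketch sets the variable for a child process on the assumption that it needs to be in the environment before the ROCm runtime starts up, rather than after torch has already been imported.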

I was struggling with ROCm-enabled TensorFlow (the tensorflow-rocm package). Without setting the environment variable
export HSA_OVERRIDE_GFX_VERSION=10.3.0
tensorflow-rocm crashes.
After setting it, it works with the 6700 XT as well (at least it does not crash), but I still have problems with it finding libraries.
It seems ROCm has a lot of catching up to do.

Related

How to properly install COLMAP on Ubuntu when using a conda environment?

I want to use COLMAP on a set of images, but I am not sure of the proper way to go about it on Linux in this particular circumstance. COLMAP works fine in another conda environment, but I don't understand the differences and am uncertain what is causing my problems. I would appreciate help understanding how to approach this, or a solution if you happen to have one.
I have tried two ways of installing COLMAP: following COLMAP's official Linux installation guide, https://colmap.github.io/install.html, and running sudo apt install colmap. With a simple sudo apt install, everything generally works, but with the code I am trying to use I run into an error where the feature matching fails (EX1).
However, when I uninstall this COLMAP and try the other method, it fails at "cmake .. ". To solve this I tried specifying my gcc and g++ (CC=/usr/bin/gcc-8 CXX=/usr/bin/g++-9 cmake ..), but this had no effect. The error in this case was a long list of errors similar to (EX2), which seems to indicate an issue with my CUDA. CUDA normally works fine for me in base, but I am wondering whether being in a conda environment (which I am installing COLMAP in) would affect its ability to link to targets?
Something else I had heard is that the precompiled version of COLMAP won't work on Linux, so it is necessary to follow the method in COLMAP's Linux guide rather than running sudo apt install colmap, but I am confused about that.
I'm very new to Linux, so any advice on how to approach solving this would be helpful!
EX1)
`Exhaustive feature matching
==============================================================================
Shader not supported by your hardware!
ERROR: SiftGPU not fully supported
ERROR: SiftGPU not fully supported
==== running: mkdir colmap_sparse
==== running: colmap mapper --database_path colmap.db --image_path "images" --output_path colmap_sparse
==============================================================================
Loading database
==============================================================================
Loading cameras... 1 in 0.000s
Loading matches... 0 in 0.000s
Loading images... 10 in 0.000s (connected 0)
Building correspondence graph... in 0.000s (ignored 0)
Elapsed time: 0.000 [minutes]
WARNING: No images with matches found in the database.
`
EX2)
`CMake Error at cmake/CMakeHelper.cmake:121 (add_library):
Target "colmap" links to target "CUDA::cudart" but the target was not
found. Perhaps a find_package() call is missing for an IMPORTED target, or
an ALIAS target is missing?
Call Stack (most recent call first):
src/CMakeLists.txt:65 (COLMAP_ADD_STATIC_LIBRARY)`
OS: Ubuntu 20.04 LTS
GPU: RTX A2000
CUDA: 11.8
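One way to narrow down whether the conda environment is involved is to check which binaries are actually picked up inside it. This is a rough diagnostic sketch, not a fix: the tool list and the helper name locate_tools are assumptions of mine. It only reports where each tool resolves on PATH and whether that location lies inside the active conda environment, using the CONDA_PREFIX variable that conda sets on activation.

```python
import os
import shutil


def locate_tools(tools=("colmap", "cmake", "nvcc", "gcc", "g++")):
    """Map each tool name to (resolved path or None, True if inside the conda env).

    locate_tools is a hypothetical helper; the default tool list is an assumption.
    """
    prefix = os.environ.get("CONDA_PREFIX")
    report = {}
    for name in tools:
        path = shutil.which(name)  # None when the tool is not on PATH
        in_conda = False
        if prefix and path:
            real_prefix = os.path.realpath(prefix) + os.sep
            in_conda = os.path.realpath(path).startswith(real_prefix)
        report[name] = (path, in_conda)
    return report


for tool, (path, in_conda) in locate_tools().items():
    print(f"{tool}: {path} (inside conda env: {in_conda})")
```

Running it once in the environment where COLMAP works and once in the one where it fails should show whether the two pick up different cmake, compiler, or CUDA binaries.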

Error--rescode.err_missing_license_file(1008): License cannot be located

I was running a Python program that uses CVXPY to solve an optimization problem involving semidefinite constraints. Initially the code ran well when I was using the default solver provided by CVXPY. Then I tried to use MOSEK as the solver. Since it has to be installed, I tried installing it from the command prompt using pip. However, the installation was interrupted midway (I am unaware of the specific reasons). Now, whenever I try to run the code, it raises this error:
rescode.err_missing_license_file(1008): License cannot be located. The default search path is ';C:\Users\dsouv\mosek\mosek.lic;'.
My understanding is that the default search path has somehow been changed by the failed installation of MOSEK. Even when I call the default solver of CVXPY, I still get the same error.
Things I have tried:
Reinstalling CVXPY.
Reinstalling MOSEK from the Anaconda Powershell Prompt.
Even after trying these, the error still persists. Any suggestions for solving this issue are welcome. Also, please tell me if you need any other information.
Thanks
You should install the license file separately, i.e. do step 3 at
https://www.mosek.com/resources/getting-started/
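Before rerunning the solver, it can help to confirm that the license file really is where MOSEK looks for it. A minimal sketch, assuming the default location is the mosek/mosek.lic path under the home directory, as the error message suggests; default_license_path and license_installed are hypothetical helper names.

```python
from pathlib import Path


def default_license_path(home=None):
    """Build <home>/mosek/mosek.lic (location assumed from the error message)."""
    home_dir = Path(home) if home is not None else Path.home()
    return home_dir / "mosek" / "mosek.lic"


def license_installed(home=None):
    """Return True if a file exists at the assumed default license location."""
    return default_license_path(home).is_file()


print(default_license_path(), license_installed())
```

If this reports that the file is missing, the error is about the license itself rather than about the interrupted pip installation.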

OpenCV code keeps crashing and doesn't execute afterward

I was working on a project that was running fine, but now, when I try to set up the whole code again for the client, my OpenCV code executes and crashes the first time I try to run it. Afterwards the code doesn't even execute until I restart my system. Then the code executes and crashes again, and after that it doesn't execute at all. The terminal shows the message (attached image) and nothing happens afterwards.
os: ubuntu 18.04
python version: python3
opencv version 4
The problem seems to come from the PyTorch library, as the error comes from torch.nn.modules.
I recommend checking whether you have the latest version of PyTorch, or whether there is an issue in test_2.py at line 139 when calling:
utils.predict(model_ft, labels, transforms, test_dir, imgFile)

modelsim-altera was not found (linux)

I've installed Quartus II 64-Bit on my PC under Linux Mint 17.3. I don't have any problems with it, but I can't run simulations with ModelSim-Altera. I get the message "ModelSim-Altera was not found...". There are detailed instructions for installing ModelSim-Altera, which I've followed, but they haven't helped. I've also tried adding "/" at the end of the path to ModelSim-Altera. Now I don't know what I have to do to make it run.
Additional information:
I also have the following error when running vsim:
** Fatal: Read failure in vlm process (0,0)
Segmentation fault (core dumped)
I've tried following the instructions under the link (problem number two), but I still get the error mentioned above.
After following all the instructions under the link above (problem number one and then problem number two), I got vsim running well. But unfortunately I still can't run ModelSim from Quartus II.
I know it's an old thread, but I came here looking for the answer and ended up everywhere else.
Adding this to .bashrc appears to have worked:
export PATH=$PATH:~/altera/13.0sp1/modelsim_ase/bin:~/altera/13.0sp1/quartus/bin
It may also be worth noting that I re-downloaded modelsim-altera, even though it said I already had it. I was not able to run modelsim_ase/linuxaloem/vsim (libXext.so.6 not found, although it too was installed) but it now appears to work. 64 bit Ubuntu 16.04.
Hopefully this helps someone else.
Edit: also export QUARTUS_64BIT=1 in .bashrc

Error in using 'theano-nose' command

After installing Theano from Enthought Canopy on Windows, following the steps here: http://deeplearning.net/software/theano/install.html#id9 , I tried to execute the command theano-nose from Canopy terminal. I got an error message saying "unable to find theano-nose". Can someone tell me what might be going wrong?
theano-nose is run from the command line, but your shell needs to be able to find it.
It is quite possible that it didn't get installed correctly on Windows, or that you need to reboot or log out and back in. I do not remember the Windows particularities at that level.
But a simple workaround is to start Python and run this:
import theano
theano.test()
This is equivalent to what you want to do.
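To check whether the shell can find the script, one can ask Python where it installs command-line scripts and whether that directory is on PATH. This is a sketch under assumptions: it uses the standard-library sysconfig module, it assumes theano-nose was installed into the default scripts directory of the running interpreter (Canopy may use a different location), and scripts_dir_on_path is a hypothetical helper name.

```python
import os
import sysconfig


def scripts_dir_on_path(environ=None):
    """Return (scripts directory, True if that directory is listed in PATH).

    scripts_dir_on_path is a hypothetical helper, not part of Theano or Canopy.
    """
    environ = os.environ if environ is None else environ
    scripts = sysconfig.get_path("scripts")

    def normalise(p):
        # Normalise case and separators so Windows paths compare reliably.
        return os.path.normcase(os.path.normpath(p))

    entries = [normalise(p) for p in environ.get("PATH", "").split(os.pathsep) if p]
    return scripts, normalise(scripts) in entries


scripts, found = scripts_dir_on_path()
print(f"scripts directory: {scripts}")
print(f"listed in PATH: {found}")
```

If the directory is not listed in PATH, the shell will not find theano-nose even when the installation itself succeeded.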
