I'm trying to use CUDA, but I'm running into a problem.
greymachine ~/NVIDIA_CUDA-5.0_Samples/1_Utilities/deviceQuery $ ./deviceQuery
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
cudaGetDeviceCount returned 35
-> CUDA driver version is insufficient for CUDA runtime version
greymachine ~/NVIDIA_CUDA-5.0_Samples/1_Utilities/deviceQuery $
That's my problem.
My config:
$ nvidia-settings -q NvidiaDriverVersion
Attribute 'NvidiaDriverVersion' (greymachine.localdomain:0.0): 310.19
$ uname -r
3.7.1-un-def-alt2.1
$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2012 NVIDIA Corporation
Built on Fri_Sep_21_17:28:58_PDT_2012
Cuda compilation tools, release 5.0, V0.2.1221
$ ./deviceQueryDrv
./deviceQueryDrv Starting...
CUDA Device Query (Driver API) statically linked version
Detected 1 CUDA Capable device(s)
Device 0: "GeForce GT 430"
CUDA Driver Version: 5.0
CUDA Capability Major/Minor version number: 2.1
Total amount of global memory: 1024 MBytes (1073283072 bytes)
( 2) Multiprocessors x ( 48) CUDA Cores/MP: 96 CUDA Cores
GPU Clock rate: 1400 MHz (1.40 GHz)
Memory Clock rate: 800 Mhz
Memory Bus Width: 128-bit
L2 Cache Size: 131072 bytes
Max Texture Dimension Sizes 1D=(65536) 2D=(65536,65535) 3D=(2048,2048,2048)
Max Layered Texture Size (dim) x layers 1D=(16384) x 2048, 2D=(16384,16384) x 2048
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 32768
Warp size: 32
Maximum number of threads per multiprocessor: 1536
Maximum number of threads per block: 1024
Maximum sizes of each dimension of a block: 1024 x 1024 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 65535
Texture alignment: 512 bytes
Maximum memory pitch: 2147483647 bytes
Concurrent copy and kernel execution: Yes with 1 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Concurrent kernel execution: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device PCI Bus ID / PCI location ID: 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
$ lsmod | grep nvidia
nvidia 9381500 39
i2c_core 30993 3 i2c_i801,nvidia,videodev
dmesg:
[ 28.548939] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 310.19 Thu Nov 8 00:52:03 PST 2012
[ 29.065356] NVRM: GPU at 0000:01:00: GPU-5a5ce500-f7fd-ab9d-64d7-cc0d1fe26ff1
[ 29.065360] NVRM: Your system is not currently configured to drive a VGA console
[ 29.065361] NVRM: on the primary VGA device. The NVIDIA Linux graphics driver
[ 29.065362] NVRM: requires the use of a text-mode VGA console. Use of other console
[ 29.065363] NVRM: drivers including, but not limited to, vesafb, may result in
[ 29.065364] NVRM: corruption and stability problems, and is not supported.
[ 1682.331776] NVRM: GPU at 0000:01:00: GPU-5a5ce500-f7fd-ab9d-64d7-cc0d1fe26ff1
$ ldd ./deviceQuery
linux-vdso.so.1 (0x00007fffd4dff000)
libcudart.so.5.0 => /usr/local/cuda-5.0/lib64/libcudart.so.5.0 (0x00007f5e90c26000)
libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00007f5e90922000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f5e9070c000)
libc.so.6 => /lib64/libc.so.6 (0x00007f5e90362000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f5e90145000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f5e8ff40000)
librt.so.1 => /lib64/librt.so.1 (0x00007f5e8fd38000)
libm.so.6 => /lib64/libm.so.6 (0x00007f5e8fa3e000)
/lib64/ld-linux-x86-64.so.2 (0x00007f5e90eaa000)
I installed the CUDA toolkit downloaded from the NVIDIA site. I use the precompiled drivers from my distro (ALT Linux), but libcuda.so doesn't ship with them, so I copied that library from the original NVIDIA driver package. Compilation works fine. I also tested a 2.6.32 kernel with driver 304.51 and got the same message, but that is understandable, since CUDA 5.0 ships with the 304.54 driver.
As I understand it, having a newer driver than the one bundled with the CUDA toolkit should be fine, yet as you can see, something is still wrong.
So, can the kernel driver be newer than the original one (i.e. the driver that came with CUDA)?
Should I compile the kernel modules myself? If so, why? My distro's modules work fine.
Thanks
This appears to have been caused by a very broken CUDA installation. The usual solution is to uninstall everything, install a supported host toolchain, and then reinstall the driver and toolkit. Every toolkit ships with detailed installation instructions and system requirements; if you read and follow them, issues of this type are rare.
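Before reinstalling, it can help to sanity-check which pieces disagree. A rough sketch (the libcuda path below is an assumption, since that library was copied by hand):
cat /proc/driver/nvidia/version              # version of the loaded kernel module
ls -l /usr/lib64/libcuda.so*                 # the user-space driver library the runtime loads (assumed path)
/usr/local/cuda-5.0/bin/nvcc --version       # toolkit/runtime version
ldd ./deviceQuery | grep cudart              # which libcudart the sample actually links against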
[This answer was assembled from comments and added as a community wiki entry to get it off the unanswered question list]
Related
I'm trying to run PyTorch on a laptop that I have. It's an older model, but it does have an NVIDIA graphics card. I realize it's probably not sufficient for real machine learning, but I'm trying it so I can learn the process of getting CUDA installed.
I have followed the steps on the installation guide for Ubuntu 18.04 (my specific distribution is Xubuntu).
My graphics card is a GeForce 845M, verified by lspci | grep nvidia:
01:00.0 3D controller: NVIDIA Corporation GM107M [GeForce 845M] (rev a2)
01:00.1 Audio device: NVIDIA Corporation Device 0fbc (rev a1)
I also have gcc 7.5 installed, verified by gcc --version
gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
And I have the correct headers installed, verified by trying to install them with sudo apt-get install linux-headers-$(uname -r):
Reading package lists... Done
Building dependency tree
Reading state information... Done
linux-headers-4.15.0-106-generic is already the newest version (4.15.0-106.107).
I then followed the installation instructions using a local .deb for version 10.1.
Now, when I run nvidia-smi, I get:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.87.00 Driver Version: 418.87.00 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce 845M On | 00000000:01:00.0 Off | N/A |
| N/A 40C P0 N/A / N/A | 88MiB / 2004MiB | 1% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 982 G /usr/lib/xorg/Xorg 87MiB |
+-----------------------------------------------------------------------------+
and when I run nvcc -V I get:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
I then performed the post-installation instructions from section 6.1, and as a result echo $PATH looks like this:
/home/isaek/anaconda3/envs/stylegan2_pytorch/bin:/home/isaek/anaconda3/bin:/home/isaek/anaconda3/condabin:/usr/local/cuda-10.1/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
echo $LD_LIBRARY_PATH looks like this:
/usr/local/cuda-10.1/lib64
and my /etc/udev/rules.d/40-vm-hotadd.rules file looks like this:
# On Hyper-V and Xen Virtual Machines we want to add memory and cpus as soon as they appear
ATTR{[dmi/id]sys_vendor}=="Microsoft Corporation", ATTR{[dmi/id]product_name}=="Virtual Machine", GOTO="vm_hotadd_apply"
ATTR{[dmi/id]sys_vendor}=="Xen", GOTO="vm_hotadd_apply"
GOTO="vm_hotadd_end"
LABEL="vm_hotadd_apply"
# Memory hotadd request
# CPU hotadd request
SUBSYSTEM=="cpu", ACTION=="add", DEVPATH=="/devices/system/cpu/cpu[0-9]*", TEST=="online", ATTR{online}="1"
LABEL="vm_hotadd_end"
After all of this, I even compiled and ran the samples. ./deviceQuery returns:
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "GeForce 845M"
CUDA Driver Version / Runtime Version 10.1 / 10.1
CUDA Capability Major/Minor version number: 5.0
Total amount of global memory: 2004 MBytes (2101870592 bytes)
( 4) Multiprocessors, (128) CUDA Cores/MP: 512 CUDA Cores
GPU Max Clock rate: 863 MHz (0.86 GHz)
Memory Clock rate: 1001 Mhz
Memory Bus Width: 64-bit
L2 Cache Size: 1048576 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 1 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device supports Compute Preemption: No
Supports Cooperative Kernel Launch: No
Supports MultiDevice Co-op Kernel Launch: No
Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.1, CUDA Runtime Version = 10.1, NumDevs = 1
Result = PASS
and ./bandwidthTest returns:
[CUDA Bandwidth Test] - Starting...
Running on...
Device 0: GeForce 845M
Quick Mode
Host to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(GB/s)
32000000 11.7
Device to Host Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(GB/s)
32000000 11.8
Device to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(GB/s)
32000000 14.5
Result = PASS
NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
But after all of this, this Python snippet (in a conda environment with all dependencies installed):
import torch
torch.cuda.is_available()
returns False
Does anybody have any idea how to resolve this? I've tried adding /usr/local/cuda-10.1/bin to /etc/environment like this:
PATH=$PATH:/usr/local/cuda-10.1/bin
and restarting the terminal, but that didn't fix it. I really don't know what else to try.
EDIT - Results of collect_env for @kHarshit
Collecting environment information...
PyTorch version: 1.5.0
Is debug build: No
CUDA used to build PyTorch: 10.2
OS: Ubuntu 18.04.4 LTS
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
CMake version: Could not collect
Python version: 3.6
Is CUDA available: No
CUDA runtime version: 10.1.243
GPU models and configuration: GPU 0: GeForce 845M
Nvidia driver version: 418.87.00
cuDNN version: Could not collect
Versions of relevant libraries:
[pip] numpy==1.18.5
[pip] pytorch-ranger==0.1.1
[pip] stylegan2-pytorch==0.12.0
[pip] torch==1.5.0
[pip] torch-optimizer==0.0.1a12
[pip] torchvision==0.6.0
[pip] vector-quantize-pytorch==0.0.2
[conda] numpy 1.18.5 pypi_0 pypi
[conda] pytorch-ranger 0.1.1 pypi_0 pypi
[conda] stylegan2-pytorch 0.12.0 pypi_0 pypi
[conda] torch 1.5.0 pypi_0 pypi
[conda] torch-optimizer 0.0.1a12 pypi_0 pypi
[conda] torchvision 0.6.0 pypi_0 pypi
[conda] vector-quantize-pytorch 0.0.2 pypi_0 pypi
PyTorch doesn't use the system's CUDA library. When you install PyTorch from the precompiled binaries with either pip or conda, it ships with its own locally installed copy of the specified CUDA version. In fact, you don't even need CUDA installed on your system to use PyTorch with CUDA support.
There are two scenarios that could have caused your issue (the sketch after this list shows how to tell them apart):
1) You installed the CPU-only version of PyTorch. In this case PyTorch wasn't compiled with CUDA support, so it cannot use CUDA.
2) You installed the CUDA 10.2 version of PyTorch. In this case the problem is that your graphics card currently uses the 418.87 driver, which only supports up to CUDA 10.1. The two potential fixes here would be to either install an updated driver (version >= 440.33 according to Table 2) or to install a version of PyTorch compiled against CUDA 10.1.
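A minimal way to tell these two cases apart from the shell (a sketch, assuming the conda environment with torch is active):
python3 -c 'import torch; print(torch.version.cuda)'          # None for a CPU-only build, otherwise e.g. 10.2
python3 -c 'import torch; print(torch.cuda.is_available())'   # False if the driver is too old for that build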
To determine the appropriate command to use when installing PyTorch you can use the handy widget in the "Install PyTorch" section at pytorch.org. Just select the appropriate operating system, package manager, and CUDA version then run the recommended command.
In your case one solution was to use
conda install pytorch torchvision cudatoolkit=10.1 -c pytorch
which explicitly specifies to conda that you want to install the version of PyTorch compiled against CUDA 10.1.
For more information about PyTorch CUDA compatibility with respect to drivers and hardware, see this answer.
Edit: Now that you have added the output of collect_env, we can see that the problem was that you had the CUDA 10.2 version of PyTorch installed. Based on that, an alternative solution would have been to update the graphics driver, as elaborated in item 2 and the linked answer.
TL;DR
Install the NVIDIA CUDA Toolkit provided by Canonical or from the NVIDIA third-party PPA.
Reboot your workstation.
Create a clean Python virtual environment (or reinstall all CUDA-dependent packages).
Description
First, install the NVIDIA CUDA Toolkit provided by Canonical:
sudo apt install -y nvidia-cuda-toolkit
or follow the NVIDIA developer instructions:
# ENVARS ADDED **ONLY FOR READABILITY**
NVIDIA_CUDA_PPA=https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/
NVIDIA_CUDA_PREFERENCES=https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
NVIDIA_CUDA_PUBKEY=https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/7fa2af80.pub
# Add NVIDIA Developers 3rd-Party PPA
sudo wget ${NVIDIA_CUDA_PREFERENCES} -O /etc/apt/preferences.d/nvidia-cuda
sudo apt-key adv --fetch-keys ${NVIDIA_CUDA_PUBKEY}
echo "deb ${NVIDIA_CUDA_PPA} /" | sudo tee /etc/apt/sources.list.d/nvidia-cuda.list
# Install development tools
sudo apt update
sudo apt install -y cuda
Then reboot the OS to load the kernel with the NVIDIA drivers.
Create an environment using your favorite manager (conda, venv, etc.):
conda create -n stack-overflow pytorch torchvision
conda activate stack-overflow
or reinstall pytorch and torchvision into the existing one:
conda activate stack-overflow
conda install --force-reinstall pytorch torchvision
Otherwise, the NVIDIA CUDA C/C++ bindings may not be detected correctly.
Finally, ensure CUDA is correctly detected:
(stack-overflow)$ python3 -c 'import torch; print(torch.cuda.is_available())'
True
Versions
NVIDIA CUDA Toolkit v11.6
Ubuntu LTS 20.04.x
Ubuntu LTS 22.04 (tested prior to the official release)
In my case, just restarting my machine made the GPU active again. The initial message I got was that the GPU was currently in use by another application, but nvidia-smi showed nothing using it. So, with no changes to dependencies, it simply started working again.
Another possible scenario is that the environment variable CUDA_VISIBLE_DEVICES is not set correctly before installing PyTorch.
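For illustration, a minimal sketch of the effect (assuming an otherwise working PyTorch CUDA build): the variable hides devices from whatever process inherits it, so an empty value leaves PyTorch with no visible GPUs:
CUDA_VISIBLE_DEVICES= python3 -c 'import torch; print(torch.cuda.is_available())'    # False: no devices visible
CUDA_VISIBLE_DEVICES=0 python3 -c 'import torch; print(torch.cuda.is_available())'   # True on a working setup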
In my case it worked to do as follows:
remove the CUDA drivers
sudo apt-get remove --purge nvidia*
Then get the exact installation commands for the drivers for your distro and system from this link: https://developer.nvidia.com/cuda-downloads?target_os=Linux
In my case it was Debian on x86_64, so I did:
wget https://developer.download.nvidia.com/compute/cuda/repos/debian11/x86_64/cuda-keyring_1.0-1_all.deb
sudo dpkg -i cuda-keyring_1.0-1_all.deb
sudo add-apt-repository contrib
sudo apt-get update
sudo apt-get -y install cuda
And now nvidia-smi works as intended!
I hope that helps
If your CUDA version does not match what PyTorch expects, you will see this issue.
On Arch / Manjaro:
Get Pytorch from here: https://pytorch.org/get-started/locally/
Note what CUDA version you are getting PyTorch for
Get the same CUDA version from here: https://archive.archlinux.org/packages/c/cuda/
Install CUDA using (e.g.) sudo pacman -U --noconfirm cuda-11.6.2-1-x86_64.pkg.tar.zst
Do not update to a newer version of CUDA than PyTorch expects. If PyTorch wants 11.6 and you have updated to 11.7, you will get the error message.
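A quick way to check that the two versions line up (a sketch for Arch-based systems, assuming PyTorch is already installed):
pacman -Qi cuda | grep Version                         # CUDA version installed from the Arch package
python3 -c 'import torch; print(torch.version.cuda)'   # CUDA version PyTorch was built against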
Make sure that os.environ['CUDA_VISIBLE_DEVICES'] = '0' is set after if __name__ == "__main__":. So your code should look like this:
import torch
import os

if __name__ == "__main__":
    os.environ['CUDA_VISIBLE_DEVICES'] = '0'
    print(torch.cuda.is_available())  # True
    ...
I used an NVIDIA card with the proprietary drivers installed on Debian Stretch.
But because I carry my hard drive between different machines (Intel, AMD, but always amd64 architecture), I decided to drop the NVIDIA card and roll OpenGL back to Mesa in order to get 3D acceleration on every machine. After a lot of struggling I identified and recovered some files the NVIDIA installer had badly overwritten (libGL.so, libdrm2.so).
I have now successfully restored the 64-bit libraries, so glxgears, the browser's WebGL support, gnuplot, etc. work well.
But the 32-bit libraries (wine, Steam) still don't work; they always fall back to the "Mesa X11" renderer.
I used glxinfo
$ LIBGL_DEBUG=verbose glxinfo | grep "OpenGL renderer string"
to identify which .so and DRI driver get selected. It prints the lookup process and the renderer:
libGL: OpenDriver: trying /usr/lib/x86_64-linux-gnu/dri/tls/r600_dri.so
libGL: OpenDriver: trying /usr/lib/x86_64-linux-gnu/dri/r600_dri.so
libGL: Using DRI2 for screen 0
OpenGL renderer string: Gallium 0.4 on AMD SUMO (DRM 2.50.0 / 4.12.0-0.bpo.1-amd64, LLVM 3.9.1)
To investigate the 32-bit libraries (we can't have both the 64-bit and 32-bit Mesa packages installed at the same time), I downloaded the 32-bit version:
$ apt-get download mesa-utils:i386
I unpacked it and tried to figure out why it fails to select the proper DRI driver:
LIBGL_DEBUG=verbose ./glxinfo | grep "OpenGL renderer string"
OpenGL renderer string: Mesa X11
The previous, 64-bit glxinfo prints debugging information to stderr, so we can see how the selection happens.
With the 32-bit version I can't get any useful information, even if I specify the
LIBGL_DRIVERS_PATH=/usr/lib/i386-linux-gnu/dri/
environment variable, where Mesa should find the proper 32-bit .so.
$ file /usr/lib/i386-linux-gnu/dri/r600_dri.so
/usr/lib/i386-linux-gnu/dri/r600_dri.so: ELF 32-bit LSB shared object, Intel 80386, version 1 (GNU/Linux), dynamically linked, BuildID[sha1]=d5177f823f11ac8ea7412e517aa6684154de506e, stripped
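Putting it together, the invocation I am trying looks roughly like this (the grep pattern is only meant to catch both the loader messages and the renderer line):
LIBGL_DEBUG=verbose LIBGL_DRIVERS_PATH=/usr/lib/i386-linux-gnu/dri ./glxinfo 2>&1 | grep -iE 'libGL|renderer'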
How can I get more information about the Mesa DRI selection?
Details
What drivers/packages do I have to install in order to enable OpenCL over multiple platforms: CPU (Intel), Integrated GPU (Intel), Dedicated GPU (NVIDIA)?
It would be nice to have all platforms running OpenCL 1.2 or above
I know it is probably a simple fix, maybe just the right selection of libraries/SDKs, but I am having trouble getting more than one platform to work.
I'm running Ubuntu 14.04: I have an Intel Core i5 with integrated Intel graphics and a dedicated NVIDIA GeForce 710M board.
Resources I have used
https://wiki.tiker.net/OpenCLHowTo
Here (under Debian) it tells me that I only need:
Packages of ICD loaders: (you just need one of these)
Packages of ICDs
Package for headers
What I have already tried
Installed CUDA7.5 (yes for all)
Had a black screen due to driver conflicts
Resolved it by uninstalling all NVIDIA drivers and installing 352
Left the CUDA SDK install in place
From: How to make OpenCL work on 14.10 + Nvidia 331.89 drivers?
sudo apt-get install nvidia-331 nvidia-331-uvm nvidia-opencl-dev nvidia-modprobe
Those packages downgraded my drivers to 331 and 340
Also from: How to make OpenCL work on 14.10 + Nvidia 331.89 drivers?
Linked libraries with:
sudo ln -s /usr/include/nvidia-352/GL /usr/local/include
sudo ln -s /usr/lib/x86_64-linux-gnu/libOpenCL.so.1 /usr/local/lib/libOpenCL.so
OpenCL 1.1 worked for NVIDIA GPU
Could not get OpenCL 1.2, so decided to uninstall 331 and 340 and install 352 again
Installed 352 (again)
OpenCL 1.1 stopped working for NVIDIA GPU (and still does not work)
Installed Intel opencl_runtime_14.2_x64_4.5.0.8.tgz
Created a symbolic link to the Intel ICD with:
sudo ln -s /opt/intel/opencl-1.2-4.5.0.8/etc/intel64.icd
OpenCL 1.2 worked for CPU (and still works)
Installed clinfo with sudo apt-get install clinfo
Only the Intel CPU platform is detected
Tried installing several different NVIDIA packages to get the NVIDIA GPU working again, but had no luck with that
Installed packages and some information:
ICD in Vendors?
ls -l /etc/OpenCL/vendors/
total 4
-rw-r--r-- 1 root root 15 Out 22 2015 Altera.icd
lrwxrwxrwx 1 root root 45 Abr 28 13:48 intel64.icd -> /opt/intel/opencl-1.2-4.5.0.8/etc/intel64.icd
Note the missing nvidia.icd
CL and GL: the GL entry used to be a valid link, but now it shows up IN RED (broken)
ls -l /usr/local/include
total 4
lrwxrwxrwx 1 root root 31 Abr 28 12:48 CL -> /usr/local/cuda-7.5/include/CL/
lrwxrwxrwx 1 root root 26 Abr 27 11:44 GL -> /usr/include/nvidia-352/GL (IN RED COLOR - folder doesn't exist anymore)
.so Files
ls -l /usr/local/lib/ | grep CL
lrwxrwxrwx 1 root root 40 Abr 27 11:45 libOpenCL.so -> /usr/lib/x86_64-linux-gnu/libOpenCL.so.1
Installed packages
dpkg --get-selections | grep nvidia
nvidia-340 deinstall
nvidia-352 install
nvidia-libopencl1-340 deinstall
nvidia-libopencl1-340-updates deinstall
nvidia-libopencl1-352 deinstall
nvidia-libopencl1-352-updates install
nvidia-modprobe install
nvidia-opencl-icd-340 deinstall
nvidia-opencl-icd-352 deinstall
nvidia-prime install
nvidia-settings install
dpkg --get-selections | grep opencl
nvidia-libopencl1-340 deinstall
nvidia-libopencl1-340-updates deinstall
nvidia-libopencl1-352 deinstall
nvidia-libopencl1-352-updates install
nvidia-opencl-icd-340 deinstall
nvidia-opencl-icd-352 deinstall
ocl-icd-libopencl1:amd64 deinstall
ocl-icd-libopencl1:i386 deinstall
opencl-headers install
unity-scope-openclipart install
clinfo
clinfo
Number of platforms: 1
Platform Profile: FULL_PROFILE
Platform Version: OpenCL 1.2 LINUX
Platform Name: Intel(R) OpenCL
Platform Vendor: Intel(R) Corporation
Platform Extensions: cl_khr_icd cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_spir cl_intel_exec_by_local_thread cl_khr_depth_images cl_khr_3d_image_writes cl_khr_fp64
Platform Name: Intel(R) OpenCL
Number of devices: 1
Device Type: CL_DEVICE_TYPE_CPU
Device ID: 32902
Max compute units: 4
Max work items dimensions: 3
Max work items[0]: 8192
Max work items[1]: 8192
Max work items[2]: 8192
Max work group size: 8192
Preferred vector width char: 1
Preferred vector width short: 1
Preferred vector width int: 1
Preferred vector width long: 1
Preferred vector width float: 1
Preferred vector width double: 1
Native vector width char: 16
Native vector width short: 8
Native vector width int: 4
Native vector width long: 2
Native vector width float: 8
Native vector width double: 4
Max clock frequency: 1800Mhz
Address bits: 64
Max memory allocation: 2040185856
Image support: Yes
Max number of images read arguments: 480
Max number of images write arguments: 480
Max image 2D width: 16384
Max image 2D height: 16384
Max image 3D width: 2048
Max image 3D height: 2048
Max image 3D depth: 2048
Max samplers within kernel: 480
Max size of kernel argument: 3840
Alignment (bits) of base address: 1024
Minimum alignment (bytes) for any datatype: 128
Single precision floating point capability
Denorms: Yes
Quiet NaNs: Yes
Round to nearest even: Yes
Round to zero: No
Round to +ve and infinity: No
IEEE754-2008 fused multiply-add: No
Cache type: Read/Write
Cache line size: 64
Cache size: 262144
Global memory size: 8160743424
Constant buffer size: 131072
Max number of constant args: 480
Local memory type: Global
Local memory size: 32768
Error correction support: 0
Unified memory for Host and Device: 1
Profiling timer resolution: 1
Device endianess: Little
Available: Yes
Compiler available: Yes
Execution capabilities:
Execute OpenCL kernels: Yes
Execute native function: Yes
Queue properties:
Out-of-Order: Yes
Profiling : Yes
Platform ID: 0x1659390
Name: Intel(R) Core(TM) i5-3337U CPU @ 1.80GHz
Vendor: Intel(R) Corporation
Device OpenCL C version: OpenCL C 1.2
Driver version: 1.2.0.8
Profile: FULL_PROFILE
Version: OpenCL 1.2 (Build 8)
Extensions: cl_khr_icd cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_spir cl_intel_exec_by_local_thread cl_khr_depth_images cl_khr_3d_image_writes cl_khr_fp64
So...
How can I get the NVIDIA GPU to also show up as an OpenCL 1.2 (or higher) platform? How about the integrated Intel graphics?
Would AMD libraries work with my hardware?
Why are most of the NVIDIA packages marked as deinstall?
As mentioned before, three things are necessary:
From https://wiki.tiker.net/OpenCLHowTo
Packages of ICD loaders: (you just need one of these)
Packages of ICDs
Package for headers
Thus, for an Intel CPU and an NVIDIA GPU:
Packages of ICD loaders:
ocl-icd-libopencl1
Packages of ICDs
Installed Intel OpenCL runtime
nvidia-opencl-icd-352
Package for headers
opencl-headers
However, to get it to work, it is necessary to make sure these packages show up as install in dpkg --get-selections | grep opencl:
sudo apt-get install --reinstall nvidia-opencl-icd-352 opencl-headers ocl-icd-libopencl1
On top of that, you must make sure that intel64.icd and nvidia.icd are in /etc/OpenCL/vendors (ls -l /etc/OpenCL/vendors).
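A quick way to confirm what the ICD loader will actually see (a sketch; each .icd file simply contains the name or path of the vendor library):
ls -l /etc/OpenCL/vendors/
cat /etc/OpenCL/vendors/*.icd      # e.g. libnvidia-opencl.so.1 for the NVIDIA ICD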
That said, I had to link intel64.icd with:
cd /etc/OpenCL/vendors/
sudo ln -s /opt/intel/opencl-1.2-X.X.X.X/etc/intel64.icd
And, since nvidia.icd was not in that folder (even after installing the right package), I had to extract it from the .deb package and move it manually:
dpkg -x /var/cache/apt/archives/nvidia-opencl-icd-352_352.63-0ubuntu0.14.04.1_amd64.deb ~/tempfolder
sudo mv ~/tempfolder/etc/OpenCL/vendors/nvidia.icd /etc/OpenCL/vendors/nvidia.icd
rm -r ~/tempfolder
Finally, make sure the NVIDIA card is the active GPU:
sudo prime-select nvidia
sudo reboot
Install and execute clinfo, and both platforms should show up.
clinfo
Number of platforms: 2
Platform Profile: FULL_PROFILE
Platform Version: OpenCL 1.2 CUDA 7.5.23
Platform Name: NVIDIA CUDA
Platform Vendor: NVIDIA Corporation
Platform Extensions: cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts
Platform Profile: FULL_PROFILE
Platform Version: OpenCL 1.2 LINUX
Platform Name: Intel(R) OpenCL
Platform Vendor: Intel(R) Corporation
Platform Extensions: cl_khr_icd cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_spir cl_intel_exec_by_local_thread cl_khr_depth_images cl_khr_3d_image_writes cl_khr_fp64
PC Config:
AMD Dual-core E-350 1.6Ghz
AMD 6310 HD Graphics Card
I'm running Ubuntu 14.04 in VMware Player on top of Windows 8 and trying to figure out if I can run OpenCL on Ubuntu inside the VM.
I was reading a manual on how to install OpenCL, and the first step gives the output below. Can someone look at the output and tell me what I should do? Can I run OpenCL in a VM?
1) First, we need to check our system configuration to determine whether it is possible to install OpenCL on our machine. We can run the hardinfo command in the terminal to get a complete summary of our system's configuration. If the hardinfo command is not installed on our system, we can easily install it by running the following command in the terminal: sudo apt-get install hardinfo
2) If our system can support OpenCL, then we can proceed with the actual installation of OpenCL.
3)Download the Intel SDK for OpenCL Applications from Intel's web site.
.....
varun@varun-virtual-machine:/$ hardinfo
The program 'hardinfo' is currently not installed. You can install it by typing:
sudo apt-get install hardinfo
varun@varun-virtual-machine:/$ sudo apt-get install hardinfo
Reading package lists... Done
Building dependency tree
Reading state information... Done
You might want to run 'apt-get -f install' to correct these:
The following packages have unmet dependencies:
fglrx : Depends: fglrx-core but it is not installable
Recommends: fglrx-amdcccle but it is not going to be installed
E: Unmet dependencies. Try 'apt-get -f install' with no packages (or specify a solution).
It is possible to run OpenCL in a virtual machine, but you will only have access to the CPU device.
The Intel SDK will not work because it has strict hardware requirements, but the AMD SDK installs fine. I am running AMD-APP-SDK-v2.8-RC-lnx64 on an Ubuntu 14.04 VM under Windows 8 + VirtualBox. You can get the SDK from here: http://developer.amd.com/tools-and-sdks/opencl-zone/amd-accelerated-parallel-processing-app-sdk/download-archive/
Since Ubuntu 18.04, all you need to get a version of OpenCL running on a CPU-only environment like a VM is:
sudo apt-get install libpocl2
Then clinfo will tell you, for example:
Number of platforms 1
Platform Name Portable Computing Language
Platform Vendor The pocl project
Platform Version OpenCL 1.2 pocl 1.4, None+Asserts, LLVM 9.0.1, RELOC, SLEEF, DISTRO, POCL_DEBUG
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd
Platform Extensions function suffix POCL
Platform Name Portable Computing Language
Number of devices 1
Device Name pthread-Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
Device Vendor GenuineIntel
Device Vendor ID 0x6c636f70
Device Version OpenCL 1.2 pocl HSTR: pthread-x86_64-pc-linux-gnu-skylake
Driver Version 1.4
Device OpenCL C Version OpenCL C 1.2 pocl
Device Type CPU
Device Profile FULL_PROFILE
Device Available Yes
Compiler Available Yes
Linker Available Yes
Max compute units 8
Max clock frequency 2807MHz
Device Partition (core)
Max number of sub-devices 8
Supported partition types equally, by counts
Supported affinity domains (n/a)
Max work item dimensions 3
Max work item sizes 4096x4096x4096
Max work group size 4096
Preferred work group size multiple 8
Preferred / native vector sizes
char 16 / 16
short 16 / 16
int 8 / 8
long 4 / 4
half 0 / 0 (n/a)
float 8 / 8
double 4 / 4 (cl_khr_fp64)
Half-precision Floating-point support (n/a)
Single-precision Floating-point support (core)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Correctly-rounded divide and sqrt operations Yes
Double-precision Floating-point support (cl_khr_fp64)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Address bits 64, Little-Endian
Global memory size 24638332928 (22.95GiB)
Error Correction support No
Max memory allocation 8589934592 (8GiB)
Unified memory for Host and Device Yes
Minimum alignment for any data type 128 bytes
Alignment of base address 1024 bits (128 bytes)
Global Memory cache type Read/Write
Global Memory cache size 6291456 (6MiB)
Global Memory cache line size 64 bytes
Image support Yes
Max number of samplers per kernel 16
Max size for 1D images from buffer 536870912 pixels
Max 1D or 2D image array size 2048 images
Max 2D image size 16384x16384 pixels
Max 3D image size 2048x2048x2048 pixels
Max number of read image args 128
Max number of write image args 128
Local memory type Global
Local memory size 4194304 (4MiB)
Max number of constant args 8
Max constant buffer size 4194304 (4MiB)
Max size of kernel argument 1024
Queue properties
Out-of-order execution Yes
Profiling Yes
Prefer user sync for interop Yes
Profiling timer resolution 1ns
Execution capabilities
Run OpenCL kernels Yes
Run native kernels Yes
printf() buffer size 16777216 (16MiB)
Built-in kernels (n/a)
Device Extensions cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_fp64 cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp64
NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) Portable Computing Language
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) Success [POCL]
clCreateContext(NULL, ...) [default] Success [POCL]
clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT) Success (1)
Platform Name Portable Computing Language
Device Name pthread-Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) Success (1)
Platform Name Portable Computing Language
Device Name pthread-Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) Success (1)
Platform Name Portable Computing Language
Device Name pthread-Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
ICD loader properties
ICD loader Name OpenCL ICD Loader
ICD loader Vendor OCL Icd free software
ICD loader Version 2.2.11
ICD loader Profile OpenCL 2.1
Tested:
WSL Ubuntu 20.04 on Windows
Ubuntu 18.04, 20.04 on Github Actions (Azure VMs)
Many thanks to the PoCL team for making this possible!
Related
I recently installed the CUDA toolkit 5.5 with driver 331.67 (I have a GeForce GTX 680). For some reason, I cannot run any of the test scripts:
$./NVIDIA_CUDA-5.5_Samples/1_Utilities/deviceQuery/deviceQuery
./NVIDIA_CUDA-5.5_Samples/1_Utilities/deviceQuery/deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
cudaGetDeviceCount returned 38
-> no CUDA-capable device is detected
Result = FAIL
I followed the steps in the "Getting Started Guide" here:
http://docs.nvidia.com/cuda/cuda-getting-started-guide-for-linux/
and made a script to create the character device files at startup (since I am running the server edition of Ubuntu, such device files aren't created by default):
$ls -l /dev/nvidia*
crw-rw-rw- 1 root root 195, 0 Apr 11 17:29 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Apr 11 17:29 /dev/nvidiactl
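For reference, the script follows the pattern from the guide; roughly this (a sketch, with the device count derived from lspci):
#!/bin/bash
# Load the module, then create one node per NVIDIA controller plus the control node
/sbin/modprobe nvidia || exit 1
N3D=$(lspci | grep -i nvidia | grep -ci '3d controller')
NVGA=$(lspci | grep -i nvidia | grep -ci 'vga compatible')
N=$((N3D + NVGA - 1))
for i in $(seq 0 $N); do
  mknod -m 666 /dev/nvidia$i c 195 $i
done
mknod -m 666 /dev/nvidiactl c 195 255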
The output of the command nvidia-smi -a (for both a normal user and root) is:
Failed to initialize NVML: Unknown Error
Here is some info on the nvidia module
$ lsmod | grep nvidia
nvidia 11335080 0
$ modinfo nvidia
filename: /lib/modules/3.11.0-17-generic/updates/dkms/nvidia.ko
alias: char-major-195-*
version: 331.67
supported: external
license: NVIDIA
...
...
Any suggestions? Thanks.
EDIT #1
I tried downgrading to driver 319.76:
$ modinfo nvidia
filename: /lib/modules/3.11.0-17-generic/updates/dkms/nvidia.ko
alias: char-major-195-*
version: 319.76
supported: external
...
Now when I run nvidia-smi -a I get the following:
NVIDIA: API mismatch: the NVIDIA kernel module has version 304.116,
but this NVIDIA driver component has version 319.76. Please make
sure that the kernel module and all NVIDIA driver components
have the same version.
Failed to initialize NVML: Unknown Error
I installed the nvidia-current-updates and nvidia-settings-updates packages from the repos before installing the driver file, and I guess that's where the conflict arose. I have not found a solution, but I think this is one step closer. Here is the result of modprobe -l | grep nvidia:
kernel/drivers/video/nvidia/nvidiafb.ko
kernel/drivers/net/ethernet/nvidia/forcedeth.ko
updates/dkms/nvidia.ko
updates/dkms/nvidia_304_updates.ko
So it turns out the main error I was encountering was caused by a version mismatch between the NVIDIA kernel module and the user-space driver component. Here are the steps that led me to a resolution.
1) Downgrading the driver allowed me to see nvidia-smi -a complain about a driver component mismatch. I wasn't sure this would be a problem originally; I was simply following a CUDA toolkit setup guide, which didn't mention it.
2) Having installed the kernel modules from the repos, I just picked the corresponding driver component with the correct version. If you don't know the version of your installed kernel module, you can use modprobe and modinfo. For example, on my system:
$ modprobe -l | grep nvidia
kernel/drivers/video/nvidia/nvidiafb.ko
kernel/drivers/net/ethernet/nvidia/forcedeth.ko
updates/dkms/nvidia.ko
updates/dkms/nvidia_304_updates.ko
The module nvidia_304_updates was installed from the repos (package nvidia-current-updates). Its exact version can be found with modinfo:
$ modinfo /lib/modules/3.11.0-17-generic/updates/dkms/nvidia_304_updates.ko
filename: /lib/modules/3.11.0-17-generic/updates/dkms/nvidia_304_updates.ko
alias: char-major-195-*
version: 304.116
supported: external
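More generally, comparing the loaded module against the installed user-space packages makes a mismatch obvious at a glance (a sketch for Debian/Ubuntu-style systems):
cat /proc/driver/nvidia/version     # version of the kernel module currently loaded
dpkg -l | grep -E '^ii.*nvidia'     # NVIDIA user-space packages actually installed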
After downloading and installing the corresponding driver component from the archive on the NVIDIA website (http://www.nvidia.com/Download/Find.aspx?lang=en-us), I was able to run the command:
$ nvidia-smi -a
==============NVSMI LOG==============
Timestamp : Mon Apr 14 15:17:44 2014
Driver Version : 304.116
Attached GPUs : 1
GPU 0000:04:00.0
Product Name : GeForce GTX 680
...
...
And the original script I was trying to execute now works:
$ ./deviceQuery
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "GeForce GTX 680"
CUDA Driver Version / Runtime Version 5.0 / 5.0
CUDA Capability Major/Minor version number: 3.0
Total amount of global memory: 2047 MBytes (2146762752 bytes)
( 8) Multiprocessors x (192) CUDA Cores/MP: 1536 CUDA Cores
...
...