Is CUDA installed correctly on my Ubuntu 10.04? Some samples don't run. - linux

I am trying to install CUDA on a server running Ubuntu 10.04.
I followed the NVIDIA instructions and installed the "CUDA Toolkit for Ubuntu Linux 10.04", the "GPU Computing SDK code samples", and the "Developer Drivers for Linux (260.19.26) (64 bit)"; my system is 64-bit. The installation seemed successful. Everything was downloaded from http://developer.nvidia.com/object/cuda_3_2_downloads.html#Linux
As the installation packages' messages instructed, I added /usr/local/cuda/bin to PATH and /usr/local/cuda/lib64:/usr/local/cuda/lib to LD_LIBRARY_PATH.
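For reference, the environment setup described above can be sketched as follows. This assumes the default install prefix /usr/local/cuda; the CUDA_HOME variable name is only an illustration, not something the installer requires.

```shell
# Assumed default install prefix; adjust if the toolkit lives elsewhere.
CUDA_HOME=/usr/local/cuda

# Put the toolkit binaries (nvcc etc.) first on PATH.
export PATH="$CUDA_HOME/bin:$PATH"

# Add the 64-bit and 32-bit library directories; only append the old
# value if LD_LIBRARY_PATH was already set, to avoid a trailing colon.
export LD_LIBRARY_PATH="$CUDA_HOME/lib64:$CUDA_HOME/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Confirm the shell now resolves the compiler from the toolkit.
command -v nvcc || echo "nvcc not found on PATH"
```

These lines only affect the current shell; putting them in ~/.bashrc (or similar) makes them persistent.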
Then I tried to run the sample programs. The strange thing is that some of them run and some of them don't, even though they all build with no problem.
For example:
- convolutionSeparable just hangs without any message; I can kill it with Ctrl+C.
- matrixMul outputs the line
  Device 0: "Quadro 5000" with Compute 2.0 capability
  and hangs there; again, it can be killed with Ctrl+C.
- clock works and outputs
  PASSED
  time = 12574
  Press ENTER to exit...
- simpleMultiCopy outputs PASSED.
- MonteCarlo outputs PASSED.
- simpleZeroCopy outputs PASSED.
- bandwidthTest hangs forever with a blinking cursor.
What is wrong here? How can I check whether my CUDA installation is successful? Why do those programs not run? They don't even print an error message.

I would start by upgrading the driver to 260.19.36, which can be found here. Then I would suggest running nvidia-smi -a to see if the driver is happy. Then I second the suggestion to run deviceQuery to see if the CUDA Toolkit 3.2 is working.
If deviceQuery output appears nominal, then I would start adding printf's to see where things go awry in matrixMul.

What does deviceQuery say? Also check the output of dmesg right after you run that program to see if you can figure out what's up.
Another tip, if you are still having issues, is to try running:
strace ./deviceQuery 2> out.txt
Then check out.txt to see if you can find any clues as to why this error is occurring.
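Since several samples hang rather than fail, it can help to run each one under a time limit so the hanging ones are flagged automatically. The sketch below assumes the coreutils `timeout` command is available; the `run_sample` helper and the `LIMIT` variable are hypothetical names used only for illustration.

```shell
# run_sample CMD [ARGS...]: run one sample with a time limit and report the outcome.
# LIMIT (seconds) is an assumed value; pick one longer than a healthy run.
LIMIT=${LIMIT:-30}

run_sample() {
    status=0
    # stdin comes from /dev/null so "Press ENTER to exit..." prompts do not block.
    timeout "$LIMIT" "$@" </dev/null >/dev/null 2>&1 || status=$?
    case $status in
        0)   echo "$1: finished" ;;
        124) echo "$1: still running after ${LIMIT}s (hung?)" ;;  # timeout's exit status
        *)   echo "$1: exited with status $status" ;;
    esac
}
```

Usage, from the directory holding the built samples: `for s in ./deviceQuery ./bandwidthTest ./matrixMul; do run_sample "$s"; done`. Any sample reported as hung is a good candidate for the strace and dmesg checks above.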

I had a similar problem and solved it by updating the kernel and drivers.
Install a newer kernel on 10.04:
- linux-image-generic-pae-lts-backport-natty
- linux-headers-generic-pae-lts-backport-natty
Download the latest NVIDIA driver from http://www.nvidia.com/Download/index.aspx?lang=en-us
Install the latest CUDA (4.0 at the moment) from http://developer.nvidia.com/cuda-toolkit-40:
- CUDA Toolkit for Ubuntu Linux 10.10 32-bit
- CUDA Tools SDK 32-bit
- GPU Computing SDK code samples
After that, all the SDK samples passed.
My system: ThinkPad W520 with a Quadro 1000 on Ubuntu 10.04.

Related

Virtual machine "pc1" Netkit error?

Introduction
I've just installed a network simulator called Netkit on Debian stretch (stable), using the official installation guide here.
Installation
After setting the correct paths and installing, I ran the check_configuration.sh script.
Every check passed, and it found the terminal emulator xterm, which Netkit needs. I received the completion message:
[ READY ] Congratulations! Your Netkit setup is now complete!
Enjoy Netkit!
The Problem
Running netkit using the command:
vstart pc1
The xterm netkit-kernel emulator starts running. However I'm getting an infinite loop of the same error message:
ubda: can't open "home/foo/netkit/pc1.disk" failed, errno= 13
So I'm guessing it's because the file is missing? If so, how do I obtain it? If not, what is causing this error? I've followed the install guide completely.
I'm assuming your system is not a 32-bit system. Netkit is only supported on the 32-bit architecture (unless the compatibility libraries are installed). Hence I would suggest you download a 32-bit VM (instead of installing the libraries) and run Netkit on that (worked fine for me).
Check the location of your lab folder.
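One thing worth noting about the error itself: on Linux, errno 13 is EACCES (permission denied), whereas a missing file would normally give errno 2 (ENOENT). A minimal sketch for checking the disk image follows; `check_disk` is a hypothetical helper name, not part of Netkit.

```shell
# check_disk PATH: report whether a disk image exists and is readable/writable.
# Note: if a parent directory cannot be traversed, the file also shows as missing.
check_disk() {
    f=$1
    if [ ! -e "$f" ]; then
        echo "missing: $f"
    elif [ ! -r "$f" ] || [ ! -w "$f" ]; then
        echo "no read/write permission: $f"
        ls -l "$f"    # show owner and mode for a closer look
    else
        echo "ok: $f"
    fi
}
```

For example, `check_disk /home/foo/netkit/pc1.disk` (path taken from the error message, assuming it is absolute). If the file is fine, the permissions of the directory you run vstart from are the next thing to look at.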

ctrl+c not killing a process

I have a process that responds perfectly well to CTRL+C on my local machine. And it appears to also be working.
But on an EC2 instance it freezes and becomes a defunct or zombie process.
kill -9 <PID> doesn't remove it and I have to reboot the EC2 instance to clean it up properly.
When it runs it also loads an in house developed shared library that I have no influence over and have no access to any source code in it to see what it's doing. This library also uses CUDA and appears to start multiple threads.
I tried installing a signal handler on the main thread and it does get installed but calling _exit doesn't shut the whole process down, it seems to still be waiting.
What might be happening here that prevents CTRL+C from exiting the process cleanly? Can I override or examine what the other threads are doing?
Ah, I found the problem. I'll leave the question as it stands in case it helps someone else.
It turns out that on my PC, I have a GTX 680 and the drivers get installed when installing CUDA. On EC2 the card is a GRID K520, and the driver installed by CUDA doesn't work. I downloaded and installed the latest stable card specific driver and it then worked.
The discovery was made after running nvidia-smi and it wouldn't print any details about the card but rather would just show Killed. Run nvidia-smi again and it would lock up the console.
Unfortunately, I hadn't tested that CUDA apps were working; I relied on the driver printing a message in the log saying it was loaded and assumed it was working.
Updating the driver consisted of downloading the latest driver from nvidia (use the .run version). Then:
sudo modprobe -r nvidia_uvm
sudo modprobe -r nvidia
Finally install it with a command like:
sudo ./NVIDIA-Linux-x86_64-3xx.xx.xx.run
I then rebooted the instance and verified it with nvidia-smi
This link was insightful - CUDA 7.5 unstable on EC2
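Because nvidia-smi itself locked up the console in the broken state, a gentler first check is to read the version file the NVIDIA kernel module exposes under /proc while it is loaded. This is only a sketch; `show_driver_version` is a hypothetical helper name.

```shell
# show_driver_version [FILE]: print the loaded NVIDIA kernel module version.
# FILE defaults to the proc entry the NVIDIA driver exposes while it is loaded.
show_driver_version() {
    f=${1:-/proc/driver/nvidia/version}
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo "NVIDIA kernel module not loaded (no $f)"
    fi
}

show_driver_version
```

This only shows which module version is loaded; as the answer above found out, a loaded module does not prove CUDA apps work, so a real test run is still needed.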

Problems running MPI (OpenMPI) app on Linux on ARM

I am trying to follow this tutorial for building and running an MPI application on an ARM based Ubuntu 11.10 system.
When installing open-mpi environment on my PC machine, the sample program runs well. However, trying the same on the ARM machine, the terminal hangs up and I need to kill the MPI process from a second terminal in order to release it.
The MPI packages I installed using apt-get, on both machines, were mpi-default-dev and mpi-default-bin, so I assume that the packages are as updated as they can be.
The first sample program in the tutorial makes every process print a "hello" message with some info. On the PC I get messages from all 8 processes (although running on a single core) and then the program ends. On the ARM, I get no output at all. The program is just stuck immediately after launch.
Any idea what's wrong? I am not even sure where to start debugging this.
Update: I tried removing the OpenMPI package and installing the alternative MPICH2 package, but the result is just the same.
Ubuntu 11.10 did not ship with a functional Open MPI implementation for ARM (although it may have shipped with a nonfunctional one). Ubuntu 12.04 did.
I would recommend building your own Open MPI from source - available at http://www.open-mpi.org/software/ompi/v1.6/, unless you can update to a more recent version of Ubuntu.
Alternatively, you could rebuild the 11.10 package using the fixes pointed out in https://bugs.launchpad.net/ubuntu/+source/openmpi/+bug/949044.

Aptana Studio 3.3.0 crashes on start - Arch Linux

Here are my system specs:
Arch Linux x86_64,
Kernel: 3.6.10-1-ARCH,
Gnome 3.6.2,
xf86-video-nouveau 1.0.4-1,
jdk7-openjdk 7.u9_2.3.3-1,
jre7-openjdk 7.u9_2.3.3-1,
jre7-openjdk-headless 7.u9_2.3.3-1,
lib32-libjpeg-turbo 1.2.1-1, libjpeg-turbo 1.2.1-1, libjpeg6-turbo 1.2.1-1
libpng12 1.2.50-2,
net-tools 1.60.20120804git-2,
unzip 6.0-6.
OK, so that's the list of installed requirements, with version numbers. Upon launch, the loading/splash screen doesn't even show, and then nothing... it just dies. I tried launching it with "aptana -v" and got no output in the shell. I have looked for error logs in ~/, but nothing is there.
I have also deleted the configuration folders/files for eclipse and aptana-secure in ~/, and did a clean uninstall of just Aptana (not the dependencies) followed by a reinstall. Same result.
Any suggestions?
It appears that there is a mix up in the downloads. 32-bit and 64-bit got switched. If you are on 32-bit download 64-bit and vice versa. Hopefully it will be fixed soon.
See issues below:
Aptana 64 bit version crashes on startup on 64bit Linux OS with 64bit Oracle 7 java
Linux Installer Aptana_Studio_3_Setup_Linux_x86_3.3.0.zip contains x64 bit version of the project
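To tell which download matches your machine, you can check the system's architecture first. A minimal sketch, using only `uname` and `getconf`:

```shell
# Report the architecture of the running system.
echo "machine: $(uname -m)"                      # e.g. x86_64 (64-bit) or i686 (32-bit)
echo "userland word size: $(getconf LONG_BIT)"   # 32 or 64
```

If the `file` utility is installed, running it on the extracted Aptana launcher binary (the exact path depends on where you unpacked the archive) reports whether it is a 32-bit or 64-bit ELF executable, which shows directly whether the archive contents were swapped.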

QEMU installation on Ubuntu shows an error

I installed QEMU on Ubuntu 12.04 in two ways (from source and from the Ubuntu Software Center), and both show the same error: the QEMU window does not pop up. When I give it a dummy filesystem, kernel, and initrd, it just shows a "VNC SERVER listening 127.0.0.1" message and hangs with no further response. Please give me the installation steps and the libraries needed to run a simple QEMU for x86.
Try building QEMU with SDL support and add the -sdl option when running it. Falling back to VNC by default probably means you don't have the SDL development library. Install libsdl-dev with apt.
