Is IIS blocking calls to CUDA from my web app?

I have an ASP.NET MVC 4 x64 web app that does some calculations in the background and returns some numbers to be rendered in the browser. Everything works fine in Visual Studio, but when the site is served from the project folder through IIS and called from the browser, I get a cudaErrorNoDevice. This is error number 38, so it does look like the app is referencing all the external CUDA DLLs correctly, making the call, and getting the error back.
For testing I'm using the GetDeviceProperties() method.
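For reference, here is a rough native CUDA runtime sketch of that kind of probe (an assumption about what the wrapper does under the hood, not the web app's actual code); in the failing case it is this enumeration that comes back with cudaErrorNoDevice (38):

    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        int count = 0;
        cudaError_t err = cudaGetDeviceCount(&count);
        if (err != cudaSuccess) {
            // A process whose session has no GPU access typically lands here
            // with cudaErrorNoDevice (38).
            std::printf("cudaGetDeviceCount failed: %d (%s)\n",
                        (int)err, cudaGetErrorString(err));
            return 1;
        }
        cudaDeviceProp prop;
        for (int i = 0; i < count; ++i) {
            cudaGetDeviceProperties(&prop, i);
            std::printf("Device %d: %s\n", i, prop.name);
        }
        return 0;
    }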
I even plugged the GPU into the displays, just in case the browser got confused and thought the CUDA call was for graphics. No luck though.
Can anyone confirm that calling the GPU from a web app is a perfectly doable thing? And if so, is there any specific configuration needed in IIS for GPUs?
Thanks
IIS 8 Express, VS2012, CUDA 5.0, GTX Titan (this is a second GPU; the GTX 660 is for display).

It's possible that IIS is running at a service level that does not have access to the GPU (which is a WDDM device in this scenario).
The usual suggestion would be to switch the GPU device to be in TCC mode (possible with most Quadro and Tesla GPUs), but that is not possible with a GeForce GPU (both of yours are GeForce GPUs).
As an alternative workaround, you may wish to try the method described here.
The statement about TCC support is a general one. Not all Quadro GPUs are supported. The final determinant of support for TCC (or not) on a particular GPU is the nvidia-smi tool. Nothing here should be construed as a guarantee of support for TCC on your particular GPU.

Related

HoloLens 2 Emulator visual updates extremely slow

I installed the latest version of the HoloLens 2 Emulator (10.0.20348.1501) on my Windows 10 Pro machine. I have 32GB of RAM, 11th Gen Intel 8 Core CPU, Nvidia 3080 (mobile) graphics card.
Initially I thought that the HoloLens emulator was just super slow (an input such as moving the pointer can take 10, 20, or 30 seconds to show up, and sometimes doesn't show up at all).
But upon testing some more, I've realized that my inputs are going through immediately (as I can tell from the sound feedback); it's just the visual feedback that is not updating. This testing is just inside the OS (without trying to launch an app I developed).
Any ideas what could be going on? In the performance monitoring tool, everything looks fine.
In the end, the only way to fix it was to disable graphics switching in the BIOS and set it to Discrete only, despite the fact that the Nvidia GPU Activity indicator shows that the GPU turns on when I launch the emulator.
If the emulator takes 10 seconds to update the graphics, there are likely configuration issues. Based on my test, though I cannot say it runs fluently on my PC, the HoloLens 2 emulator runs at around 15 fps. There is delay, but it should work fine for testing. (I am running it with an Nvidia 1080 (mobile) and a much older CPU than yours.)
Please check the document on Using the HoloLens Emulator - Mixed Reality | Microsoft Docs and make sure you have configured your computer properly.
In BIOS
Intel VT -> enabled
Intel VT-d -> disabled
Hardware-based Data Execution Prevention (DEP) (or any Intel data protection related feature; the display name may vary) -> disabled
In Windows
After the BIOS configuration is done, completely shut down your PC, then boot. (A direct reboot may not apply the changes.)
Run dxdiag to check:
DirectX 11.0 or later (12.0 in my PC)
WDDM 2.5 graphics driver or later (3.0 in my PC)
Hyper-V Checking
Enable it if it is not enabled; a reboot is required.
If it is already enabled: disable it -> reboot the PC -> enable it again -> reboot.
Others
For a laptop, make sure the power supply is plugged in and the machine is not in power-saving mode. Check the GPU load (around 36% on my Nvidia 1080 mobile).
Then you may run the emulator again to see if this issue still exists.

Error: couldn't find RGB GLX visual or fbconfig | Prime on demand Nvidia card

I have a Lenovo G580 computer with an Intel CPU and an Nvidia 610M GPU, running Linux Lite OS (Ubuntu based).
I would like to use Nvidia PRIME to run programs with the GPU.
I installed some packages for the Nvidia drivers, version 390, according to this page.
With the Nvidia X Server Settings I can switch to on-demand mode. In the UI there is only one setting for PRIME, and no mention of any GPU settings.
My problem is that when on-demand mode is enabled, many programs (games and GLX debug programs) throw this error (even without being asked to use the GPU):
Error: couldn't find RGB GLX visual or fbconfig
I know there are other posts like mine on the internet, but I can't understand the problem or identify a missing package on my computer. Have you already installed PRIME on this GPU? I can send logs or system info if needed.

Radeon developer panel not detecting running program

I have a Vulkan application I want to profile (to find the GPU bottlenecks for optimization). I am on Linux and AMD hardware, so I downloaded the Linux version of the Radeon Developer Tools. I ran it and created a local server, and that seems to work.
I then launched my program, but it does not appear on the list of profiling candidates in the panel.
The connection is fine (green dot), but no applications are detected. I have tried advanced mode as well, but no luck.
I know for a fact the program is running, as I can see it, use it, recompile it... Has anyone run into this problem before?

CUDA performance penalty when running in Windows

I've noticed a big performance hit when I run my CUDA application on Windows 7 (versus Linux). I think I may know where the slowdown occurs: for whatever reason, the Windows Nvidia driver (version 331.65) does not immediately dispatch a CUDA kernel when it is invoked via the runtime API.
To illustrate the problem I profiled the mergeSort application (from the examples that ship with CUDA 5.5).
Consider first the kernel launch time when running in Linux:
Next, consider the launch time when running in Windows:
This post suggests the problem might have something to do with the Windows driver batching the kernel launches. Is there any way I can disable this batching?
I am running with a GTX 690 GPU, Windows 7, and version 331.65 of the Nvidia driver.
There is a fair amount of overhead in sending GPU hardware commands through the WDDM stack.
As you've discovered, this means that under WDDM (only) GPU commands can get "batched" to amortize this overhead. The batching process may (probably will) introduce some latency, which can be variable, depending on what else is going on.
The best solution under Windows is to switch the operating mode of the GPU from WDDM to TCC, which can be done via the nvidia-smi command, but it is only supported on Tesla GPUs and certain members of the Quadro family of GPUs -- i.e. not GeForce. (It also has the side effect of preventing the device from being used as a Windows accelerated display adapter, which might be relevant for a Quadro device or a few specific older Fermi Tesla GPUs.)
AFAIK there is no officially documented method to circumvent or affect the WDDM batching process in the driver, but unofficially, according to Greg@NV in this link, the command to issue after the CUDA kernel call is cudaEventQuery(0); this may/should cause the WDDM batch queue to "flush" to the GPU.
As Greg points out, extensive use of this mechanism will wipe out the amortization benefit, and may do more harm than good.
EDIT: moving forward to 2016, a newer recommendation for a "low-impact" flush of the WDDM command queue would be cudaStreamQuery(stream);
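For illustration, a minimal sketch of that flush pattern (the kernel here is just a placeholder; the point is the query call issued immediately after the launch):

    #include <cuda_runtime.h>

    __global__ void dummyKernel() {}

    int main() {
        dummyKernel<<<1, 1>>>();

        // Older suggestion: a query on event 0 to nudge the WDDM batch
        // queue into submitting the pending work now.
        cudaEventQuery(0);

        // Newer, lower-impact suggestion (shown here on the default stream).
        cudaStreamQuery(0);

        cudaDeviceSynchronize();
        return 0;
    }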
EDIT2: Using recent drivers on Windows, you should be able to place Titan family GPUs in TCC mode, assuming you have some other GPU set up for the primary display. The nvidia-smi tool will allow you to switch modes (use nvidia-smi --help for more info).
Additional info about the TCC driver model can be found in the Windows install guide, including that it may reduce the latency of kernel launches.
The statement about TCC support is a general one. Not all Quadro GPUs are supported. The final determinant of support for TCC (or not) on a particular GPU is the nvidia-smi tool. Nothing here should be construed as a guarantee of support for TCC on your particular GPU.
Even though it's been almost 3 years since this issue was active, I still consider it necessary to share my findings.
I've been in the same situation: the same CUDA program took 5 ms on Ubuntu with CUDA 8.0 but over 30 ms on Windows 10 with CUDA 10.1, both on a GTX 1080 Ti.
However, on Windows, when I switched from building in Visual Studio to compiling with nvcc from the command line, the program suddenly ran at the same speed as the Linux one.
This suggests that the problem may come from Visual Studio.
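As a side note, a compiler-independent way to compare the two builds is to time the same launch with CUDA events in each binary; a rough sketch (the kernel and sizes here are placeholders, not the original program):

    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void work(float *data, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) data[i] = data[i] * 2.0f + 1.0f;
    }

    int main() {
        const int n = 1 << 20;
        float *d = nullptr;
        cudaMalloc(&d, n * sizeof(float));

        cudaEvent_t start, stop;
        cudaEventCreate(&start);
        cudaEventCreate(&stop);

        // Time only the kernel execution, bracketed by events on the
        // default stream, so the measurement is the same in both builds.
        cudaEventRecord(start);
        work<<<(n + 255) / 256, 256>>>(d, n);
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);

        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);
        std::printf("kernel time: %.3f ms\n", ms);

        cudaEventDestroy(start);
        cudaEventDestroy(stop);
        cudaFree(d);
        return 0;
    }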

Does Intel HD Family fit min graphics requirement for BB 10 Alpha Simulator

I am trying to run the BB10 Simulator to port my web app. The simulator runs okay until the point where I launch any app on the simulator; then the app crashes and never loads. The fact that the simulator runs makes me think I meet the minimum requirement.
But after looking at my graphics card, I am not sure I do; hence the app crashes on the sim. Do Intel HD Family graphics cards meet the minimum graphics requirement of an NVIDIA GeForce 8800 GT or higher, or an ATI Radeon HD 2600 or higher?
The BB10 simulator doesn't really use GPU acceleration, and I've successfully used it on a Mac Mini (i5, HD3000) and a Lenovo laptop (i3, HD3000).
The simulator runs the real OS, so unlike Ripple, if you do something forbidden (writing somewhere you're not supposed to, accessing a resource you didn't request – PIM, BBM ID, Internet), the QNX kernel kills you. (Double-check the bar-descriptor.xml.)
I've never used WebWorks, but it may be a good idea to install the Native SDK: Momentics (the C++/Cascades IDE) can be frightening, but there is a "QNX Device" perspective that can open a browser into the {simulator|real phone} file system and give access to logs. You will get more detail explaining why your application was killed.
