Malloc inside Kernel - malloc

I am trying to compile code that calls malloc inside a kernel,
and I get this error:
Error 5 error : calling a host function("malloc") from a __device__/__global__ function("bitapS") is not allowed C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.0\C\src\str_bit\main.cu 36 1 str_bit
My command line is:
Error 6 error MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.0\bin\nvcc.exe" -gencode=arch=compute_10,code=\"sm_10,compute_10\" -gencode=arch=compute_20,code=\"sm_20,compute_20\" --use-local-env --cl-version 2010 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin\x86_amd64" -I"../../common/inc" -I"../../../shared/inc" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.0\include" -G0 --keep-dir "x64\Debug" -maxrregcount=0 --machine 64 --compile -D_NEXUS_DEBUG -g -Xcompiler "/EHsc /nologo /Od /Zi /MTd " -o "x64/Debug/main.cu.obj" "C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.0\C\src\str_bit\main.cu"" exited with code 2. C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\BuildCustomizations\CUDA 4.0.targets 357 10 str_bit
Any suggestions? I thought that with sm_20 enabled you could allocate inside a kernel. My card is a GTX 460.
Thanks!

It's true that you generally should not do this, but since NVIDIA enabled it, it probably has some uses.
The code gives an error because you are compiling for both architecture 1.0 and 2.0. Device-side malloc is only available on compute capability 2.0 and higher, so the compute_10 pass fails. To make it compile, you can remove
-gencode=arch=compute_10,code=\"sm_10,compute_10\"
from the command line if you only intend to run the code on Fermi devices. Otherwise, you must provide alternative code in your source for older devices. You can do that with the preprocessor macro that NVCC defines:
__CUDA_ARCH__
like this:
#if (__CUDA_ARCH__ < 200)
/* code for 1.x arch */
#else
/* code for 2.x arch */
#endif
It seems you are using Visual Studio, so in the project properties you can go to the CUDA section and specify the architectures you want to build for.
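As a rough illustration of the guard described above, here is a minimal sketch of a kernel that uses device-side malloc on 2.x hardware and falls back to an error value on 1.x. The names scratchSum and runScratchSum are hypothetical and are not the asker's bitapS kernel. The sketch assumes the launch configuration exactly covers the output array.

```cuda
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread allocates a small scratch array on the
// device heap, fills it with 0..n-1, and stores the sum in out[tid].
__global__ void scratchSum(int *out, int n)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ < 200)
    // 1.x devices have no device-side malloc, so report failure instead.
    out[tid] = -1;
#else
    int *scratch = (int *)malloc(n * sizeof(int));
    if (scratch == NULL) {   // the device heap can run out of space
        out[tid] = -1;
        return;
    }
    int sum = 0;
    for (int i = 0; i < n; ++i) {
        scratch[i] = i;
        sum += scratch[i];
    }
    out[tid] = sum;
    free(scratch);
#endif
}

// Hypothetical host wrapper: launches one block of `threads` threads and
// returns out[0], or -2 if any CUDA runtime call fails.
int runScratchSum(int threads, int n)
{
    int *dOut = NULL;
    if (cudaMalloc(&dOut, threads * sizeof(int)) != cudaSuccess)
        return -2;
    scratchSum<<<1, threads>>>(dOut, n);
    int first = -2;
    // The blocking copy also surfaces errors from the kernel launch.
    cudaError_t err = cudaMemcpy(&first, dOut, sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(dOut);
    return (err == cudaSuccess) ? first : -2;
}
```

Calling runScratchSum(32, 10) from main should give 45 on a 2.x device. If the kernels need more heap than the default, cudaDeviceSetLimit(cudaLimitMallocHeapSize, bytes) can be called before the first kernel that uses malloc is launched.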

I found it.
You have to specify
sm_20,compute_20
in the properties of the .cu file as well, not only in the project properties!
Thanks anyway!

You should not be allocating memory inside the kernel. Ever. This is a clear sign your CUDA kernel is poorly designed and will have bad performance.
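One common alternative to allocating inside the kernel is to allocate a single scratch buffer from the host and give each thread its own part of it. The following is only a sketch under that assumption. The names scratchSumPrealloc and runScratchSumPrealloc are hypothetical, and the interleaved layout is one possible choice rather than the only one.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: every thread uses n ints of a buffer that was
// allocated once with cudaMalloc. Element i of thread tid lives at
// scratch[i * total + tid], so neighbouring threads touch neighbouring
// addresses.
__global__ void scratchSumPrealloc(int *out, int *scratch, int n)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    int total = gridDim.x * blockDim.x;
    int sum = 0;
    for (int i = 0; i < n; ++i) {
        scratch[i * total + tid] = i;
        sum += scratch[i * total + tid];
    }
    out[tid] = sum;
}

// Hypothetical host wrapper: launches one block of `threads` threads and
// returns out[0], or -2 if any CUDA runtime call fails.
int runScratchSumPrealloc(int threads, int n)
{
    int *dOut = NULL;
    int *dScratch = NULL;
    if (cudaMalloc(&dOut, threads * sizeof(int)) != cudaSuccess)
        return -2;
    if (cudaMalloc(&dScratch, (size_t)threads * n * sizeof(int)) != cudaSuccess) {
        cudaFree(dOut);
        return -2;
    }
    scratchSumPrealloc<<<1, threads>>>(dOut, dScratch, n);
    int first = -2;
    // The blocking copy also surfaces errors from the kernel launch.
    cudaError_t err = cudaMemcpy(&first, dOut, sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(dScratch);
    cudaFree(dOut);
    return (err == cudaSuccess) ? first : -2;
}
```

This version does not depend on the device heap, so it also builds for compute_10. It requires knowing an upper bound on the per-thread scratch size before the launch.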

Related

VSCode linux build multiple cpu's g++

Is it possible to use VSCode with g++ on Linux to compile a program using multiple CPU cores to speed up the build? I mean compiling multiple .cpp files at once, as Visual Studio does.

Why does ARM's cl.exe try to build x86 or x64 app from ARM Developer Command Line?

I'm working on a basic makefile to test compiling our sources for alternate Microsoft environments, like Windows Phone. I opened a VS2012 ARM Developer Command Prompt and ran nmake on the makefile. It resulted in:
nmake /f makefile.namke
...
cl /c cryptlib.cpp cpu.cpp...
cryptlib.cpp
C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\INCLUDE\crtdefs.h(338):
fatal error C1189: #error: Compiling Desktop applications for the ARM platform is not supported.
cpu.cpp
C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\INCLUDE\crtdefs.h(338):
fatal error C1189: #error: Compiling Desktop applications for the ARM platform is not supported.
...
NMAKE : fatal error U1077: '"C:\Program Files (x86)\Microsoft Visual Studio 11.0
\VC\BIN\x86_ARM\cl.EXE"' : return code '0x2'
Stop.
"Desktop applications" is kind of ambiguous, so I searched Microsoft for the meaning of the term. It appears to mean that the toolchain is building an x86 or x64 Metro UI-based application.
I feel like I'm suffering a disconnect, or Microsoft is suffering a disconnect and their tools are buggy.
Why is Microsoft's ARM version of cl.exe trying to build an x86 or x64 application instead of compiling for ARM? Or why is the VS2012 ARM Developer Command Prompt setting up for an x86 or x64 application?
I also tried remediating the problem, but the proposed solutions are not working. So now I am trying to understand what's going on at the highest levels.
For example, one answer says to add <WindowsSDKDesktopARMSupport>true</WindowsSDKDesktopARMSupport> to an ARM property sheet, but that did not work. Another answer says to add CXXFLAGS = /D _ARM_WINAPI_PARTITION_DESKTOP_SDK_AVAILABLE, but that did not work either.
The makefile is about as simple as it gets to test the compile under Microsoft's ARM toolchain.
LIB_SRCS = cryptlib.cpp cpu.cpp ...
LIB_OBJS = cryptlib.obj cpu.obj ...
TEST_SRCS = test.cpp bench1.cpp bench2.cpp ...
TEST_OBJS = test.obj bench1.obj bench2.obj ...
CXX = cl.exe /nologo
AR = lib.exe
CXXFLAGS =
all: cryptest.exe
cryptest.exe: $(TEST_OBJS) cryplib.lib
	$(CXX) $(CXXFLAGS) /ref:cryplib.lib /out:$@ $(TEST_SRCS)
cryplib.lib : $(LIB_OBJS)
	$(CXX) $(CXXFLAGS) $(LIB_SRCS)
	$(AR) $(LIB_OBJS)
The correct solution would be to add something similar to /D WINAPI_FAMILY=WINAPI_FAMILY_APP or /D WINAPI_FAMILY=WINAPI_FAMILY_PHONE_APP.
The Windows SDK headers have three different API subset targets: "desktop", "app" (the new "metro" style app introduced in Windows 8) and "phone". When building from within Visual Studio, the right subset is chosen automatically depending on the project type, but when building manually by invoking cl.exe, you need to specify the target yourself. (The chosen target limits which declarations are visible in the headers, hiding the ones that are unavailable in the more limited subsets.)
Out of tradition, desktop is the default, but when targeting ARM, you need to choose one of the other ones, since there's no publicly supported target for third party code in desktop mode for ARM. (The WinRT tablets did have a desktop mode, but third parties weren't allowed to build apps for it.)
Since Windows 10 and MSVC 2015, you don't need to distinguish between phone and app, but should use "app" for both.
The other flag that was suggested, _ARM_WINAPI_PARTITION_DESKTOP_SDK_AVAILABLE, basically tells the headers to pretend that you are allowed to build for the desktop API family even for ARM. To use it, it seems you should define it like this: /D _ARM_WINAPI_PARTITION_DESKTOP_SDK_AVAILABLE=1 (i.e. just defining it isn't enough, you should define it to 1). For plain code that doesn't use much of the Windows API, this should work equally well, but the better solution is to declare the real API target, to get proper warnings when trying to use APIs that aren't available.

nvcc.exe linking error: Microsoft Visual Studio configuration file 'vcvars64.bat' could not be found

I want to use nvcc -ptx from windows command line, but I always get this error message:
nvcc : fatal error : Microsoft Visual Studio configuration file 'vcvars64.bat' could not be found for installation at 'C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\bin/../..'
I'm using VS 2012 Express edition. What could the solution be?
I managed to solve the issue and make nvcc work with MS Visual Studio Express 2012. Here is what I did:
1. Installed MS Visual Studio 2012 Express.
2. Installed cuda_5.5.20_winvista_win7_win8_general_64, the latest version as of 2014-01-16.
3. Copied the x86_amd64 directory in C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\bin to a new directory named amd64 in the same location.
4. In the new directory, C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\bin\amd64, created a file named vcvars64.bat.
5. In vcvars64.bat, added a single line: CALL setenv /x64
The compilation worked great:
C:\CUDA>nvcc -o square square.cu
Creating library square.lib and object square.exp
C:\CUDA>square.exe
0.000000 1.000000 4.000000 9.000000
16.000000 25.000000 36.000000 49.000000
64.000000 81.000000 100.000000 121.000000
144.000000 169.000000 196.000000 225.000000
From the NVIDIA CUDA Compiler Driver document:
1.2. Supported Host Compilers
nvcc uses the following compilers for host code compilation:
On Linux platforms: the GNU compiler, gcc, and arm-linux-gnueabihf-g++ for cross compilation to the ARMv7 architecture
On Windows platforms: the Microsoft Visual Studio compiler, cl
On both platforms, the compiler found on the current execution search path will be used, unless nvcc option -compiler-bindir is specified (see File and Path Specifications).
Your visual studio install is asking for .NET v3.5 framework:
http://www.microsoft.com/en-us/download/details.aspx?id=21
Got this info from this: Where can I find Microsoft.Build.Utilities.v3.5
When in your project go to Configuration Properties > CUDA C/C++ > Device and change Code Generation to the following: compute_11,sm_11

How to enable native 64-bit compiler in Visual Studio?

Below is a very simple C++ program:
x64test.cpp
int main()
{
char * p = new char[0xffffffffff];
}
My intention is to allocate a big buffer larger than 4 GB. In a native 64-bit process, that should be OK, but Visual Studio 2011 Beta refuses to compile x64test.cpp and reports: "error C2148: total size of array must not exceed 0x7fffffff bytes".
I have googled and found a useful article at http://blogs.msdn.com/b/windowssdk/archive/2007/09/08/updated-windows-sdk-visual-c-cross-compilers.aspx
According to the article, I should use a native 64-bit compiler to compile x64test.cpp. However, Visual Studio can only be launched as a 32-bit process so that msbuild.exe and cl.exe are always running as 32-bit processes.
I have tried to configure the solution platform to x64, but it had no effect.
I have used the so-called native 64-bit compiler to successfully compile x64test.cpp by the following steps:
1, start cmd.exe as an administrator;
2, cd C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\bin\amd64;
3, cl x64test.cpp
My question is:
Is there a way to enable the native 64-bit compiler in Visual Studio IDE?

How to get a 64 bit dll with c source file, def file, link file by using command line in vc 6.0

My compile environment is windows xp and vc 6.0.
Now I have a C source file (msgRout.c), a def file (msgRout.def), and a link file (msgRout.link), and I use the commands below to get a 32-bit DLL:
1. cl /I ../include -c -W3 -Gs- -Z7 -Od -nologo -LD -D_X86_=1 -DWIN32 -D_WIN32 -D_MT -D_DLL msgRout.c
2. lib -out:msgRout.lib -def:msgRout.def -machine:i386
3. link /LIBPATH:../../Lib -nod -nologo -debug:full -dll @msgRout.link -out:msgRout.dll
But the DLL I got cannot be loaded by an x64 application, which requires a 64-bit DLL.
So here is my question:
Can I get a 64-bit DLL with VC 6.0?
Using only commands like the three above, how can I get a 64-bit DLL?
Many GREAT THANKS!!!
Allan
Visual C++ 6.0 does not include 64-bit compiler/libraries. You will need either a more recent version of Visual C++ or a Windows Platform SDK that has the 64-bit support. The earliest one is the Windows Server 2003 Platform SDK.
Once you have that installed, cl /? and link /? will have info on how to build 64-bit apps.
Update: If you have VS2005, you can build 64-bit binaries with the x86-amd64 cross-compiler (a 32-bit cl.exe that produces 64-bit code) or with the x64 compiler (a 64-bit cl.exe). To do that, you need to:
Make sure you've installed the 64-bit tools support during VS installation.
Open a command line and set it up for x86-amd64 builds using C:\Program Files\Microsoft Visual Studio 8\VC\Vcvarsall.bat x86_amd64, or (on 64-bit Windows) open an x64 command line and set it up for 64-bit builds using C:\Program Files\Microsoft Visual Studio 8\VC\Vcvarsall.bat amd64.
Once you do that, you should be able to use the same command lines as above with a couple of small changes (for cl you'll have to define /D:X64=1 or /D_AMD64_, and for link you'll have to change /machine:x86 to /machine:x64) to produce a 64-bit version of your program.
Here are some links with more information:
Installing Visual Studio 64-bit Components
HowTo: Enable a 64-Bit Visual C++ Toolset at the Command Line
Use Visual Studio to build 64-bit application
64-bit Applications
Seven Steps of Migrating a Program to a 64-bit System
You cannot. Microsoft does not have time machines.
