What are the programming languages for GPUs?

I read an article stating that GPUs are the future of supercomputing. I would like to know which programming languages are used for programming GPUs.

OpenCL is the open, cross-platform solution and runs on both GPUs and CPUs.
Another is CUDA, which NVIDIA built for its own GPUs.
HLSL and Cg, both shading languages, are a few others.

CUDA has quite a few language ports: http://en.wikipedia.org/wiki/CUDA

Related

Programming Apple M1 chip GPU

I've used CUDA to program an Nvidia GPU, but I want to program my Apple M1 GPU. I can't find any tools online to do this. CUDA is not available for the Mac. The Apple M1 GPU should be able to execute 25,000 threads simultaneously. Preferably this would be a C-like language similar to CUDA.
The standard, portable way of computing on a GPU is to use OpenCL.
However, OpenCL was deprecated on Apple machines in 2018. Since then, Apple has recommended that developers use Metal instead (which works only on Apple machines).

Which library is more universal and more flexible: pthread, Intel TBB, or OpenMP?

I want to learn multi-core programming in C++. Could you recommend some notes, please?
Also, what are the differences among pthread, Intel TBB and OpenMP? Which library uses an Intel CPU more efficiently? Thanks.
These three are completely separate things. Pthreads is the native threading API of many modern Unix OSes that conform to the POSIX specification, including but not limited to Linux, OS X, FreeBSD and Solaris. There is also a badly supported and not very well working Pthreads implementation for Windows (Windows has its own native Win32 threading API). Pthreads is designed to support general threading scenarios and building parallel processing applications with it comes at a very high price (in number of code lines).
Intel TBB is a portable, open-source C++ template library for parallel data processing on shared memory architectures; it implements tasks and task flows. It works especially well in conjunction with C++11 lambdas (anonymous code blocks). The library can be built with many different compilers and for different architectures. On POSIX systems TBB builds upon Pthreads as the underlying threading API.
OpenMP is a directive-based extension to C/C++ and Fortran that supports data and task parallelism on shared memory architectures. It is not a library but rather a language extension, and it requires an OpenMP-enabled compiler. Virtually all modern C++ compilers support OpenMP, including GCC and the PGI, Oracle and MSVC++ compilers; Clang was long the notable exception and gained OpenMP support in version 3.7. On POSIX systems the OpenMP runtimes are built on top of Pthreads. Version 4.0 supports accelerator devices, e.g. GPUs and co-processors like Intel Xeon Phi.
Of the three, Intel TBB is the most flexible, while OpenMP is the easiest to learn. Pthreads is neither portable (e.g. badly supported on Windows) nor easy to learn. As already mentioned by Claudio, C++11 includes its own threading primitives, implemented on POSIX systems using Pthreads. But these are more like language abstractions around the threading API than parallel processing libraries/extensions like Intel TBB and OpenMP.
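To make the directive-based style described above concrete, here is a minimal sketch of an OpenMP reduction in C++. It is only an illustration, not code from the answer: the function name parallel_sum is hypothetical, and it assumes an OpenMP-enabled compiler (for example GCC with the -fopenmp flag). If OpenMP is not enabled, the pragma is simply ignored and the loop runs serially with the same result.

```cpp
#include <cassert>
#include <climits>
#include <vector>

// Hypothetical helper: sums a vector using an OpenMP parallel-for reduction.
// Each thread accumulates a private copy of `total`; OpenMP combines the
// private copies with + when the loop ends.
long long parallel_sum(const std::vector<int>& values) {
    // A plain signed int loop index keeps this valid even for old
    // OpenMP 2.0 compilers, so the input must fit in an int.
    assert(values.size() <= static_cast<std::vector<int>::size_type>(INT_MAX));
    const int n = static_cast<int>(values.size());

    long long total = 0;
    #pragma omp parallel for reduction(+ : total)
    for (int i = 0; i < n; ++i) {
        total += values[i];
    }
    return total;
}
```

Calling parallel_sum on a std::vector<int> returns the same value as a serial loop; the single pragma line is the only thing that makes it parallel, which is why OpenMP is usually the easiest of the three to pick up.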
With the C++11 standard, C++ now has native support for multithreading, without the need for any external library. It is already available in Visual Studio 2012, GCC 4.7+ and Clang.
As for your additional questions:
pthread (which stands for "POSIX threads") is a multithreading library for POSIX (i.e., Unix/Linux) systems. It does only multithreading, since POSIX already supports multitasking (i.e., through the fork primitive). It works on all hardware platforms running a POSIX OS (e.g., desktops, embedded Linux, and so on).
OpenMP is a portable API for multi-core programming in C/C++ and Fortran. The latest version is 4.0, which also supports asynchronous calls.
Intel TBB supports Windows and Linux on Intel architectures. Running TBB on non-Intel platforms (e.g., ARM) is quite complicated.
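For comparison, here is a rough sketch of the native C++11 threading mentioned in this answer, using only std::thread from the standard library. The function name threaded_sum and the chunking scheme are my own assumptions for illustration. Note how much more bookkeeping is needed than with a directive-based approach, and that GCC on Linux may need the -pthread flag to link it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical helper: splits `values` into contiguous chunks, sums each
// chunk on its own std::thread, then adds up the partial results.
long long threaded_sum(const std::vector<int>& values, unsigned num_threads) {
    assert(num_threads > 0);

    // One slot per thread, so no two threads write to the same element.
    std::vector<long long> partial(num_threads, 0);
    std::vector<std::thread> workers;

    // Chunk size rounded up so that every element is covered.
    const std::size_t chunk = (values.size() + num_threads - 1) / num_threads;

    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t first =
            std::min(values.size(), static_cast<std::size_t>(t) * chunk);
        const std::size_t last = std::min(values.size(), first + chunk);
        workers.emplace_back([&values, &partial, t, first, last] {
            partial[t] = std::accumulate(
                values.begin() + static_cast<std::ptrdiff_t>(first),
                values.begin() + static_cast<std::ptrdiff_t>(last), 0LL);
        });
    }

    // Every thread must be joined before its result is read.
    for (std::thread& worker : workers) {
        worker.join();
    }
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

Starting one thread per call is fine for a demonstration, but real code would normally reuse a pool of threads, which is the kind of scheduling that Intel TBB and the OpenMP runtime handle for you.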

AMD 6 Core and compiling open-source projects

I have been wanting to build my own box with an AMD 6-core CPU. I have always used Intel-based machines and frankly have not worked on open-source projects. I want to get into that along with systems programming, but I am worried that open-source projects (mainly Linux-based) are going to be a problem to compile on AMD.
How difficult is porting (if it is needed) from AMD to Intel and vice versa? Thanks.
Both AMD and Intel processors use the x86 ISA. You don't generally compile for a specific processor; you compile for the ISA.
Unless you turn on very specific flags (such as -march) while compiling, a binary built on one processor will run on another.
To say it again, there is no problem.
This does not mean the processors are the same. They have different performance characteristics, support different motherboard chipsets, and have different feature sets (for instance, IOMMUs or other advanced virtualization features). But you won't usually be accessing things like the processor-internal performance registers in your everyday life, so don't worry about it and get whichever CPU is right for your desired system configuration and price/performance point.
It's unlikely that it'll be any more complex than compiling anywhere else. I'm pretty sure these kinds of very minor architectural differences are almost a non-issue these days.

How to directly access a GPU?

As most of you know, CPUs, in contrast to GPUs, are not well designed for floating point calculation. I am wondering how to use a GPU's power without any abstraction layer or driver. Can I program a GPU using assembly, C or C++ (and if so, how)? Although assembly seems to let me access the GPU directly, C/C++ are likely to need an intermediate library (e.g. OpenCL) to access the GPU.
Let me ask you another question: How much of a modern GPU's capability will be exposed to a programmer without any third-party driver?
The interfaces aren't documented, so something like OpenCL is the only practical way to program the GPU directly.
Without a driver you would be stuck trying to reverse engineer the complete functionality of the GPU on your own.
Well, essentially, you would have to write a driver on either Windows or Linux. And the interfaces may be documented, depending on which chipset you are trying to use. Intel has loads of PDF documentation on their website. However, this is a non-trivial exercise at best, and your code could only be used on that set of hardware. Merely reading and understanding the documentation will take a bit of doing in most cases, because of "oops, that's not how it really works", and because how-tos for doing this or that aren't documented, just the hardware and registers. However, if you REALLY want to do this, your best bet would be to start with the open source Linux drivers for a particular chipset and tweak them to your SICK TWISTED purpose. All in all, other than for the learning aspect, it's probably a BAD idea.
GPU manufacturers like NVIDIA and ATI are closed source companies that have chosen not to disclose the GPU architecture and its inner workings to the general public. This is why we cannot program the GPU directly as we can with most CPUs. The only way we can harness the power of the GPU for calculation is by using the provided library, like CUDA in the case of NVIDIA. There is a possible way to program a GPU directly for calculations, but for that you would need to reverse engineer and document the whole GPU, its registers and its system calls, and that is not feasible with our limited resources and limited time.
PS: The only other way is to sign on as a core developer for the GPU and sign an NDA (Non-Disclosure Agreement) with the vendor, which is not likely to happen for beginners and individuals like us.

Programming on future hardware?

I want to practice writing code for future hardware. What would that be? The two main things that come to mind are 64-bit and multicore. I also note that cache is important, and GPUs have their own tech, but right now I am not interested in any graphics programming.
What else should I know about?
-edit- I know a lot of these are in the present, but pretty soon all CPUs will be multicore and threading will be more important. I considered endianness (big vs little) but found that not to be important, and I already have a big-endian CPU to test on.
My recommendation for future :)
nVidia CUDA
nVidia Tegra
Or you can focus on ray tracing.
If you'd like to dive into a "mainstream" OS that has full 64-bit support, I suggest you start coding against the beta of Mac OS X "Snow Leopard" (codename for 10.6). One of the big enhancements is Grand Central, which is a "facility" for developers to code for multicore systems. Grand Central should distribute workload not only between cores, but also to the GPU.
Also very important is the explosion of smart devices such as the iPhone, Android, etc. I strongly believe that some upcoming so-called "netbooks" will rely on operating systems such as Android and iPhone OS, so knowing how to code against their SDKs and how to optimize code for mobile devices is very important (e.g. optimizing performance, graphics or otherwise, and battery usage).
I can't foretell the future, but one aspect to look into is something like the CELL processor used in the PS3, where instead of many identical general purpose cores there is only one (although capable of simultaneous multithreading) plus many cores that are more special-purpose.
In a simple analysis, the Cell processor can be split into four components: external input and output structures, the main processor called the Power Processing Element (PPE) (a two-way simultaneous multithreaded Power ISA v.2.03 compliant core), eight fully-functional co-processors called the Synergistic Processing Elements, or SPEs, and a specialized high-bandwidth circular data bus connecting the PPE, input/output elements and the SPEs, called the Element Interconnect Bus or EIB.
CUDA and OpenCL are similar in that you split your general purpose code and your high performance computations into separate parts, which may run on different hardware and use different languages/APIs.
64 bits and multicore are the present, not the future.
About the future:
Quantum computing or something like that?
How about learning OpenCL? It's a massively parallel processing language based on C. It's similar to nVidia's CUDA but is vendor agnostic. There are no major implementations yet, but expect to see some pretty soon.
As for 64-bit, don't really worry about it. Programming will not really be any different unless you're doing really low-level stuff (kernels). Higher-level frameworks such as Java and .NET allow you to run code on 32-bit and 64-bit machines. Even C/C++ allows you to do this (but not quite so transparently).
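To illustrate the "not quite so transparently" part for C/C++, here is a small sketch of the habits that keep code working on both 32-bit and 64-bit targets. The function names are hypothetical and exist only for this example. The idea is to use the fixed-width and pointer-sized types from <cstdint> instead of assuming how wide int, long or a pointer is.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>

// Pointer width in bits for the current build target
// (typically 32 or 64, but it is not assumed here).
constexpr std::size_t pointer_bits() {
    return sizeof(void*) * CHAR_BIT;
}

// Round-trips a pointer through an integer. std::uintptr_t is wide enough
// to hold a pointer on both 32-bit and 64-bit targets, whereas int or long
// may be too narrow on some 64-bit platforms.
bool pointer_round_trips(const void* p) {
    const std::uintptr_t as_integer = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<const void*>(as_integer) == p;
}

// std::int64_t is 64 bits wide regardless of the target, so this does not
// overflow on a 32-bit build the way plain int arithmetic could.
std::int64_t bytes_in_gib(std::int64_t gib) {
    assert(gib >= 0);
    return gib * 1024 * 1024 * 1024;
}
```

Code written this way should compile unchanged for either word size; the parts that usually break are the ones that store a pointer in an int or assume that long is always 32 or always 64 bits wide.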
I agree with Oli's answer (+1) and would add that in addition to 64-bit environments, you look at multi-core environments. The industry is getting pretty close to the end of the cycle of improvements in raw speed. But we're seeing more and more multi-core CPUs. So parallel or concurrent programming, which is rilly rilly hard, is quickly becoming very much in demand.
How can you prepare for this and practice it? I've been asking myself the same question. So far, it seems to me that functional languages such as ML, Haskell, LISP, Arc, Scheme, etc. are a good place to begin, since truly functional languages are generally free of side effects and therefore very "parallelizable". Erlang is another interesting language.
Other interesting developments that I've found include
The Singularity Research OS
Transactional Memory and Software Isolated Processes
The many Software Engineering Podcast episodes on concurrency. (Here's the first one.)
This article from ACM Queue on "Real World Concurrency"
Of course this question is hard to answer because nobody knows what future hardware will look like (at least in the long term), but multi-threading and parallel programming are important and will definitely be even more important for some years.
I'd also suggest working with GPU computing like CUDA/Stream, but this could be a problem because it's very likely to change a lot in the next few years.

Resources