AMD 6 Core and compiling open-source projects

I have been wanting to build my own box with AMD 6 core. I have always used Intel based machines and frankly have not done open-source projects. I want to get into that along with Systems programming but am worried if open-source projects (mainly Linux based) are going to be a problem to compile on AMD?
How difficult is porting (if it is needed) from AMD to Intel and vice versa? Thanks.

Both AMD and Intel processors use the x86 ISA. You don't generally compile for a specific processor, you compile for the ISA.
Unless you turn on very specific flags (such as -march) while compiling, a binary built on one processor will run on another.
To say it again, there is no problem.
This does not mean the processors are the same. They have different performance characteristics, support different motherboard chipsets, and have different feature sets (for instance, IOMMUs or other advanced virtualization features). But you won't usually be accessing things like the processor-internal performance registers in your everyday life, so don't worry about it and get whichever CPU is right for your desired system configuration and price/performance point.
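To illustrate the point that software targets the ISA rather than the vendor, here is a minimal Python sketch. platform.machine() reports the architecture name (for example x86_64 on Linux or AMD64 on Windows) whether the chip is from AMD or Intel. The alias table and the helper names normalize_isa and same_isa are my own illustrative assumptions, not part of any standard API.

```python
import platform

# Names that different operating systems report for the same ISA.
# This alias table is an illustrative assumption, not an exhaustive list.
_ALIASES = {
    "amd64": "x86_64",
    "x86_64": "x86_64",
    "x64": "x86_64",
    "i386": "x86",
    "i686": "x86",
    "x86": "x86",
}


def normalize_isa(machine: str) -> str:
    """Map an OS-reported machine string to a canonical ISA name."""
    key = machine.strip().lower()
    # Unknown names are passed through unchanged (lowercased).
    return _ALIASES.get(key, key)


def same_isa(machine_a: str, machine_b: str) -> bool:
    """True if two machine strings name the same ISA."""
    return normalize_isa(machine_a) == normalize_isa(machine_b)


if __name__ == "__main__":
    # Reports the ISA of the current machine; no CPU vendor appears here.
    print(normalize_isa(platform.machine()))
```

A binary built for x86_64 is a candidate to run on any machine for which this check matches, as long as it was not compiled with processor-specific flags such as -march.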

It's unlikely that it'll be any more complex than compiling anywhere else. I'm pretty sure these kinds of very minor architectural differences are almost a non-issue these days.

Related

Setting up an OpenCL environment across operating systems

We have a fairly large codebase worked on by two dozen programmers, across linux, windows and mac. We are adding some opencl code (initially targeting intel haswell) and are looking for ideas for how to do this with minimum disruption to the team. How can we arrange things so they can all happily compile against the correct opencl headers and libraries? In other words, what can we safely assume about their environments to allow us to set up the codebase to work out of the box for everyone?

There are quite a few differences between OpenCL installations on different OSs. The best way to handle it is to start with a cross-platform build system - I personally use CMake. The newest versions of CMake include a module for finding the headers and libraries, so in theory you don't need to do anything special. However, in my experience, the module is not written well enough to support all possible SDKs and OSs. I've had to add more locations to look for the relevant files.
Once you have the headers and libraries located, you'll need to link the OpenCL library (opencl.lib on Windows) into your build, which you should be able to do in a platform-independent manner through your build system of choice.
The final part is how to include the headers in your code. Basically, cl.h lives in a different place on Windows and Linux (CL/cl.h) than on OS X (OpenCL/cl.h). If you look at any OpenCL example, you will see what I mean. You'll need to have some #ifdefs around that, which I recommend you isolate to your own include file.
You will also need to decide how to handle your kernels. For any decent size kernel, you will want to keep the source in a separate file. I keep my kernels in .cl files and then I use the STRINGIFY macro to include the source directly in my CPP code into a string variable.
I hope this gives you enough to start with.
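The header difference mentioned above can be sketched as a small build-script helper. This is only an illustration in Python of the same decision the #ifdef makes in C: the function names opencl_header and include_line are hypothetical, and a real project would more likely express this in its build system or in a wrapper header.

```python
import platform


def opencl_header(system: str) -> str:
    """Return the conventional cl.h include path for an OS name
    as reported by platform.system()."""
    # OS X / macOS ships OpenCL as a framework, so the header is under OpenCL/.
    if system == "Darwin":
        return "OpenCL/cl.h"
    # Windows and Linux SDKs conventionally place it under CL/.
    return "CL/cl.h"


def include_line(system: str = "") -> str:
    """Build the #include line for the given OS (default: the current one)."""
    system = system or platform.system()
    return f"#include <{opencl_header(system)}>"


if __name__ == "__main__":
    print(include_line())
```

Keeping this choice in one place, as the answer recommends, means the rest of the codebase never needs to know which platform it is on.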

There can be many things to do, but I am writing down a standard one:
Create an offline Standard Portable Intermediate Representation (SPIR) of your OpenCL kernels and distribute that across the different OSs. The main goal of SPIR was to enable application developers to avoid shipping their kernels in source form, while maintaining portability between vendors and devices.
Refer to this for more detail:
https://software.intel.com/en-us/articles/using-spir-for-fun-and-profit-with-intel-opencl-code-builder

Does developing applications for SPARC, IBM power CPU require separate compilers, other than x86, x86-64 targets?

Does developing applications for SPARC, IBM PowerPC require separate compilers, other than x86 and x86-64 targets?
If true, how easily could x86, x64 binaries in Linux be ported to SPARC and PowerPC? Is there a way to simulate these environments using virtualization?

First answer is, yes, to develop compiled code for Power Architecture or SPARC you need compilers that will generate code for those processors. A compiler that generates x86 or x86_64 code will not generate code that runs on Power Architecture or SPARC. You might find cross compilers running on x86 (32 or 64) that will generate Power or SPARC code, though. But the other thing to be aware of is the object file format (ELF, XCOFF, and so on). The instruction set is just part of the picture. You might get clearer answers if you provide more details of your particular starting point and goals.
Second, one normally doesn't talk of porting binaries. We port source code, which may include assembly language as well as C or other languages. The process for doing this includes compiler selection, after which you can begin an iterative process of compiling, porting, compiling, and linking the code for the new hardware. I'm omitting many details. Again, if you provide more specifics in your question, you might get more specific answers.
Third, as others have said, no, you can't use virtualization in the scenarios you allude to. You might find acceptable emulation solutions. Again, please provide more specifics if you can.

No, virtualization is not the answer. Virtualization takes your hardware platform and creates an independent "virtual" machine of the same hardware. So when running on x86, you use virtualization to create a second x86 machine.
To simulate a completely different hardware architecture, you would want to look into emulation.
How easy or hard it is to port software from one architecture to another depends completely on how the software was written. If it uses something particular to one architecture but not the other (for example, x86 can handle non-aligned memory accesses while SPARC cannot), you are going to need to fix things like that. Another example that could make porting difficult would be if the software has assumed a specific endianness of the hardware.
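As a rough illustration of the endianness problem, the following Python sketch writes a 32-bit value in one byte order and reads it back in the other. The helper names encode_u32 and decode_u32 are made up for this example; the struct format codes are the standard ones.

```python
import struct
import sys


def encode_u32(value: int, byteorder: str) -> bytes:
    """Pack an unsigned 32-bit integer with an explicit byte order
    ('little' or 'big')."""
    fmt = "<I" if byteorder == "little" else ">I"
    return struct.pack(fmt, value)


def decode_u32(data: bytes, byteorder: str) -> int:
    """Unpack an unsigned 32-bit integer with an explicit byte order."""
    fmt = "<I" if byteorder == "little" else ">I"
    return struct.unpack(fmt, data)[0]


if __name__ == "__main__":
    raw = encode_u32(0x01020304, "big")
    # Reading big-endian bytes as little-endian silently yields a wrong value,
    # which is the kind of bug that appears when code assumes the host order.
    print("host order:", sys.byteorder)
    print("bytes:", raw.hex())
    print("misread as little-endian:", hex(decode_u32(raw, "little")))
```

Code that always states the byte order explicitly, instead of relying on whatever the host happens to use, ports cleanly between little-endian x86 and big-endian machines.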

"SPARC, IBM PowerPC require separate compilers, other than x86 and x86-64 targets?"
I hate to be really snippy, but given that IBM PowerPC and SPARC do not support the x86 or x86-64 instruction sets (i.e. they speak a totally separate machine language), where did you even get the idea they would be compatible?
"Is there a way to simulate these environments using virtualization?"
Possibly yes, but it would be REALLY slow, because you would have to either translate the machine code, or - well - interpret it. Hardware virtualization would not work, given that the CPU architectures are different. SPARC and PowerPC are not just "different labels for the same thing", they are really different internally.

Use Java or LLVM, or try QEMU to test other CPUs.
It's easy if your code was written to be portable, it's not if it wasn't. Varying sizes of data types per platform and code that depends on it, inline assembly, etc. will make it harder.
Home page for LLVM and QEMU:
http://llvm.org/
http://wiki.qemu.org/Main_Page
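To make the point about varying data type sizes concrete, here is a small Python sketch using the struct module. Native sizes (the "@" prefix) depend on the platform's C compiler, while the standard sizes (the "<" prefix) are fixed everywhere. The record layout in pack_record is a hypothetical example, not taken from any real format.

```python
import struct


def native_size(code: str) -> int:
    """Size in bytes of a C type as laid out by the platform's compiler."""
    return struct.calcsize("@" + code)


def standard_size(code: str) -> int:
    """Size in bytes of the same type code in struct's fixed, portable layout."""
    return struct.calcsize("<" + code)


def pack_record(record_id: int, value: int) -> bytes:
    """Serialize a (64-bit id, 32-bit value) record in a fixed layout:
    little-endian, standard sizes, no padding."""
    return struct.pack("<qi", record_id, value)


if __name__ == "__main__":
    # A C long is commonly 8 bytes on 64-bit Linux but 4 bytes on 64-bit Windows.
    print("native long:", native_size("l"), "bytes")
    print("standard long:", standard_size("l"), "bytes")
    print("record length:", len(pack_record(1, 2)), "bytes")
```

Code that depends on the native size of long behaves differently from platform to platform; code that uses explicit fixed-width layouts does not.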

Programming on future hardware?

I want to practice writing code for future hardware. What should I look at? The two main things that come to mind are 64-bit and multicore. I also note that caches are important and that GPUs have their own tech, but right now I am not interested in any graphics programming.
What else should I know about?
-edit- I know a lot of these are in the present, but pretty soon all CPUs will be multicore and threading will be more important. I considered endianness (big vs little) but found that not to be important, and I already have a big-endian CPU to test on.

My recommendation for future :)
nVidia CUDA
nVidia Tegra
Or you can focus on ray tracing.

If you'd like to dive into a "mainstream" OS that has full 64 bit support, I suggest you start coding against the beta of Mac OS X "Snow Leopard" (codename for 10.6). One of the big enhancements is Grand Central, which is a "facility" for developers to code for multicore systems. Grand Central should distribute workload not only between cores, but also to the GPU.
Also very important is the explosion of smart devices such as the iPhone, Android, etc. I strongly believe that some upcoming so-called "netbooks" will rely on OS such as Android and iPhone OS, and as such knowing how to code against their SDK, and knowing how to optimize code for mobile devices is very important (e.g. optimizing performance graphic or otherwise, battery usage).

I can't foretell the future, but one aspect to look into is something like the CELL processor used in the PS3, where instead of many identical general purpose cores, there is only one (although capable of simultaneous multithreading) plus many cores that are more specific purpose.
In a simple analysis, the Cell processor can be split into four components: external input and output structures, the main processor called the Power Processing Element (PPE) (a two-way simultaneous multithreaded Power ISA v.2.03 compliant core), eight fully-functional co-processors called the Synergistic Processing Elements, or SPEs, and a specialized high-bandwidth circular data bus connecting the PPE, input/output elements and the SPEs, called the Element Interconnect Bus or EIB.
CUDA and OpenCL are similar in that you separate your general purpose code and high performance computations into separate parts that may run on different hardware and language/api.

64 bits and multicore are the present not the future.
About the future:
Quantum computing or something like that?

How about learning OpenCL? It's a massively parallel processing language based on C. It's similar to nVidia's CUDA but is vendor agnostic. There are no major implementations yet, but expect to see some pretty soon.
As for 64 bit, don't really worry about it. Programming will not really be any different unless you're doing really low level stuff (kernels). Higher level frameworks such as Java and .NET allow you to run code on 32 bit and 64 bit machines. Even C/C++ allows you to do this (but not quite so transparently).
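For what it's worth, here is a tiny Python sketch of where the 32 vs 64 bit difference does show up: the width of a pointer, and whether a given value fits in a signed integer of that width. The function names are my own; struct.calcsize("P") is the standard way to ask for the native pointer size.

```python
import struct


def pointer_bits() -> int:
    """Width of a native pointer in bits for the running interpreter."""
    return struct.calcsize("P") * 8


def fits_in_signed(value: int, bits: int) -> bool:
    """True if value is representable as a two's complement signed
    integer of the given width."""
    limit = 1 << (bits - 1)
    return -limit <= value < limit


if __name__ == "__main__":
    bits = pointer_bits()
    print("pointer width:", bits)
    # 2**31 overflows a signed 32-bit integer but is fine in 64 bits.
    print("2**31 fits in a native word:", fits_in_signed(2**31, bits))
```

In a high-level language this distinction is mostly invisible, which is the point of the paragraph above; it matters when you cross into C structures or file formats.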

I agree with Oli's answer (+1) and would add that in addition to 64-bit environments, you look at multi-core environments. The industry is getting pretty close to the end of the cycle of improvements in raw speed. But we're seeing more and more multi-core CPUs. So parallel or concurrent programming -- which is rilly rilly hard -- is quickly becoming very much in demand.
How can you prepare for this and practice it? I've been asking myself the same question. So far, it seems to me like functional languages such as ML, Haskell, LISP, Arc, Scheme, etc. are a good place to begin, since truly functional languages are generally free of side effects and therefore very "parallelizable". Erlang is another interesting language.
Other interesting developments that I've found include
The Singularity Research OS
Transactional Memory and Software Isolated Processes
The many Software Engineering Podcast episodes on concurrency. (Here's the first one.)
This article from ACM Queue on "Real World Concurrency"
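The claim above that side-effect-free code is easy to parallelize can be sketched in a few lines of Python. This is only an illustration: square and parallel_map are hypothetical names, and threads are used just to keep the example short. In CPython, CPU-bound work would normally use ProcessPoolExecutor instead, which offers the same map interface.

```python
from concurrent.futures import ThreadPoolExecutor


def square(n: int) -> int:
    """A pure function: the result depends only on the argument and
    nothing outside the function is modified."""
    return n * n


def parallel_map(func, items, workers: int = 4):
    """Apply func to every item using a pool of workers.

    Because func has no side effects, the calls can run in any order
    or at the same time without locks; map() still returns the results
    in input order.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, items))


if __name__ == "__main__":
    print(parallel_map(square, range(10)))
```

The moment func starts mutating shared state, this stops being safe without synchronization, which is where the "hard" part of concurrent programming begins.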

Of course this question is hard to answer because nobody knows what future hardware will look like (at least in the long term), but multi-threading and parallel programming are important and will definitely be even more important for some years.
I'd also suggest working with GPU computing like CUDA/Stream, but this could be a problem because it's very likely that this will change a lot in the next few years.

Is assembler portable between Linux distros?

Is a program shipped in assembler format portable between Linux distributions (modulo CPU architecture differences)?
Here's the background to my question: I'm working on a new programming language (named Aklo), whose modus operandi will be the classic compiling to .s and feeding the result to the GNU assembler.
Obviously it would be nice ultimately to have the implementation written in itself, but I had resigned myself to maintaining it in C++ to solve the chicken and egg problem: suppose you download the compiler for the first time and it is itself written in Aklo, how do you compile it? As I understand it, different Linux distributions and other UNIX like systems have different conventions for binary formats.
But it's just occurred to me, a solution might be to ship the .s file (well, one per CPU architecture): it's fair to assume you have or can install the GNU assembler. Of course I'd still need a bootstrap compiler, but that doesn't need to be fast; I can write it in Python.
Is assembler portable in the way that binaries are not? Are there any other stumbling blocks I haven't thought of?
Added in response to one answer:
I had looked wistfully at LLVM, there is certainly a lot of good stuff there and it would make my life easier -- except that it would incur a dependency on the correct version of LLVM being installed. It wouldn't be so bad having that dependency on development machines, but in a world where it's common to ship programs as source, the same dependency would be incurred for every user of every program ever written in Aklo, and I decided that was too high a price to pay.
But if the solution of shipping compiled programs as assembler works... then that solves that problem, and I can use LLVM after all, which would be a big win.
So the question about portability of assembler is even considerably more important than I had first realized.
Conclusion: from answers here and on the LLVM mailing list http://lists.cs.uiuc.edu/pipermail/llvmdev/2010-January/028991.html it seems the bad news is the problem is unsolvable, but the good news is that means using LLVM makes it no worse, so I'm free to do so and obtain all the advantages thereof.

You might want to check out LLVM before going down this particular path. It might make your life a lot easier, as it provides a low level virtual machine that makes a lot of hard stuff just work and has been very popular.

At a very high level, the ABI consists of { instruction set, system calls, binary format, libraries }.
Distribution as .s may free you from the binary format. This is still rather pointless, because you are fixed to a particular ISA and still need to use libraries and/or make system calls. Libraries vary from distribution to distribution (although this isn't really that bad, especially if you just use libc) and syscalls vary from OS to OS.

It's basically 20 years since I last bootstrapped a C compiler. At the level of compilers, the differences between Linux distributions are minimal.
The much more important reason for going LLVM is cross-platform; if you're not writing some intermediate language, your compiler will be extremely difficult to retarget for different processors. And seeing as, on my laptop, I have compilers for x86, x86_64, two kinds of MIPS, PowerPC, ARM and AVR... you see where I'm going? I can compile multiple languages for most of those targets too (only C for AVR).

Machine dependent languages

Why might a machine-dependent language be more appropriate for writing certain types of programs? What types of programs would be appropriate?

"Why might a machine-dependent language be more appropriate for writing certain types of programs?"
Speed
Some machines have special instruction sets (like MMX or SSE on x86, for example) that allow you to 'exploit' the architecture in ways that compilers may or may not utilize best (or not utilize at all). If speed is critical (such as in video games or data-crunching programs), then you'd want to get the most out of the architecture you're on.
Where Portability is Useless
When coding a program for a specific device (take the iPhone or the Nintendo DS as examples), portability is the least of your concerns. This code will most likely never go to another platform as it's specifically designed for that architecture/hardware combination.
Developer Ignorance and/or Market Demand
Computer video games are a prime example - Windows is the dominant computer game OS, so why target others? It lets the developers focus on known variables for speed/size/ease-of-use. Some developers are ignorant - they learn to code only on one platform (such as .NET) and 'forget' that other platforms exist because they don't know about them. They seem to take an approach similar to "It works on my machine, why should I bother porting it to a bizarre combination that I will never use?"
No other choice.
I will take the iPhone again as it is a very good example. While you can program to it in C or C++, you cannot access any of the UI widgets that are linked against the Objective-C runtime. You have no other choice but to code in Objective-C if you want to access any of those widgets.
"What types of programs would be appropriate?"
Embedded systems
All of the above apply - When you're coding for an embedded system, you want to take advantage of the full potential of the hardware you're working on. Be it memory management (Such as the CP15 on ARM9) or even obscure hardware that is only attached to the target device (servo motors, special sensors etc).

The best example I can think of is for small embedded devices. When you have to have full control over every detail of optimization due to extremely limited computing power (only a few kilobytes of RAM, for example), you might want to drop down to the assembler level yourself to make everything work perfectly in those small confines.
On the other hand, compilers have gotten sophisticated enough these days that you really don't need to drop below C for most situations, including embedded devices and microcontrollers. The situations where this is necessary are pretty rare.

Consider virtually any graphics engine. Since a single core of your run-of-the-mill general purpose CPU works through pixels largely one after another (SIMD instructions aside), you would need a bare minimum of roughly one cycle per pixel to be modified.
However, since modern GPUs can operate on many pixels (or other piece of data) all at the same time, the same operation can be finished much more quickly. GPUs are very well-suited for embarrassingly parallel problems.
Granted, we have high-level-language APIs to control our video cards nowadays, but as you get "closer to the metal", the raw language used to control a GPU is a different animal from the language to control a general purpose CPU, due to the vast difference in architectures.
