When writing code compiled by an LLVM backend, does architecture matter?

My question is actually more general than the title:
At what point does the architecture matter when writing code that will eventually be compiled to LLVM intermediary code, and then from there to the machine language?
Let's say I'm writing Rust (which uses LLVM as a backend). Am I automatically capable of compiling my Rust code to every architecture that LLVM can target (assuming there's an OS on that machine that can run it)?
Or could it be that the Rust standard library hasn't been made "ARM compatible" yet, so I couldn't compile to ARM even if the LLVM targets it?
What if I don't use any of the standard library, my entire program is just a program that returns right away? Could it be the case that even without any libraries, Rust (or what have you) can't compile to ARM (or what have you) even if the LLVM targets it?
If all the above examples compile just fine, what do I have to do to get my code to break on one architecture, or to fail to compile for a certain architecture?
Bonus question of the same variety:
Let's say the standard library makes use of OS system calls (which it surely does). Do you have to care about architecture when making system calls? Or does the OS (Linux, for example) abstract away architecture as well?
Thanks.

TL;DR
From my understanding you can compile to any target LLVM supports (there may still be a few caveats here with frontends using inline assembler or module level inline assembly), however, you are not guaranteed it will actually execute correctly. The frontend is responsible for doing the work to be portable across the platforms the author supports.
Note also that as a frontend developer you are responsible for providing the data layout and target triple.
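One way to see why the frontend, and not only LLVM, has to know the target: type sizes and struct layouts are already fixed by the time the IR reaches the backend. Below is a minimal C sketch, assuming a GCC- or Clang-style compiler; the struct and function names are made up for illustration.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical record used only for illustration. */
struct record {
    char tag;
    long value;   /* 4 bytes on 32-bit ARM Linux, 8 bytes on x86-64 Linux */
    void *next;   /* pointer width follows the target */
};

/* Width of a pointer, in bits, on the target this file was compiled for. */
unsigned pointer_bits(void)
{
    return (unsigned)(sizeof(void *) * CHAR_BIT);
}

/* Offset of `value` inside the struct. The frontend folds this to a
   constant using the target's alignment rules, so the IR it emits
   already contains a target-specific number. */
size_t value_offset(void)
{
    return offsetof(struct record, value);
}
```

Compiling the same file for two targets and comparing the results of pointer_bits() and value_offset() shows how much target knowledge is baked in before code generation starts.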
See also:
llvm-bitcode-cross-platform
LLVM FAQ
Implementing Portable sizeof
Cross Compile with Clang
Your Questions:
Let's say I'm writing Rust (which uses LLVM as a backend). Am I automatically capable of compiling my Rust code to every architecture that LLVM can target (assuming there's an OS on that machine that can run it)?
This depends on the authors of the Rust frontend: they decide which LLVM targets the compiler and its standard library support.
Or could it be that the Rust standard library hasn't been made "ARM compatible" yet, so I couldn't compile to ARM even if the LLVM targets it?
I'm pretty sure LLVM would be able to emit the instructions, but it may not be correct in terms of addressing.
I have not used the inline assembler facilities mentioned above myself, but I assume if it allows platform specific assembly then this would break platform agnostic compilation as well.
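To illustrate that assumption in C terms, here is a minimal sketch using the GCC/Clang extended asm syntax. The function add_one is made up for this example; the point is only that each assembly string is valid for one architecture, so the source needs a guarded path per target plus a portable fallback.

```c
/* Adds one to x. The inline assembly paths are target specific
   (GCC/Clang extended asm, default AT&T syntax on x86); every other
   target falls back to plain C. */
int add_one(int x)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__("addl $1, %0" : "+r"(x) : : "cc");
#elif defined(__aarch64__)
    __asm__("add %w0, %w0, #1" : "+r"(x));
#else
    x = x + 1;  /* portable fallback */
#endif
    return x;
}
```

Without the preprocessor guards, the x86 string would be rejected by the assembler when building for ARM, even though the surrounding C is portable.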
What if I don't use any of the standard library, my entire program is just a program that returns right away? Could it be the case that even without any libraries, Rust (or what have you) can't compile to ARM (or what have you) even if the LLVM targets it?
This again depends on what the Rust frontend emits. There may be some boilerplate setup logic it emits even before it emits instructions for your logic.
I'm writing my own language in LLVM that does this in the case of a special function called "main". I am targeting the C ABI so it will wrap this main with a proper C style main and invoke it with a stricter set of parameters.
If all the above examples compile just fine, what do I have to do to get my code to break on one architecture, or to fail to compile for a certain architecture?
Consider C/C++ with Clang, as mentioned in the LLVM FAQ. Clang is a frontend for LLVM, probably the most popular one, and users writing C/C++ are responsible for #include-ing the appropriate platform-specific functionality.
Some languages may be designed to be more platform independent, and the frontend could then handle the work for you.
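As a small illustration of how C code ends up tied to a target, here is a sketch that branches on macros predefined by GCC and Clang. The function name target_arch_name is hypothetical, and the list of architectures is only an example.

```c
/* Returns a name for the architecture this translation unit targets,
   using macros predefined by GCC and Clang. */
const char *target_arch_name(void)
{
#if defined(__x86_64__)
    return "x86_64";
#elif defined(__i386__)
    return "x86";
#elif defined(__aarch64__)
    return "aarch64";
#elif defined(__arm__)
    return "arm";
#elif defined(__riscv)
    return "riscv";
#else
    /* Replacing this branch with
     *     #error "unsupported architecture"
     * makes the build fail on any target not listed above. */
    return "unknown";
#endif
}
```

The commented-out #error is the simplest way to get code that deliberately refuses to compile for a certain architecture.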
Let's say the standard library makes use of OS system calls (which it surely does). Do you have to care about architecture when making system calls? Or does the OS (Linux, for example) abstract away architecture as well?
I'm assuming you are talking about the case where the frontend targets the C standard library in which case LLVM has standard C library intrinsics which could be used by the frontend. This is not the only way, however, as you can use the call instruction to invoke C functions directly if targeting the C ABI as in the Kaleidoscope example.
In the end the standard library can be a portability issue and must be addressed by the frontend developers.
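To make this concrete for Linux: the kernel's system call numbers and the registers used to pass arguments differ per architecture (for example, write is number 1 on x86-64 and 64 on AArch64), and the C library hides that behind its wrappers and the SYS_* constants. Below is a minimal sketch, assuming Linux with glibc or a compatible libc; raw_getpid is a name made up for this example.

```c
#define _GNU_SOURCE       /* exposes syscall() in glibc headers */
#include <sys/syscall.h>  /* SYS_getpid: the value is architecture specific */
#include <unistd.h>       /* syscall(), getpid() */

/* Asks the kernel for the process ID through the generic syscall()
   entry point instead of the getpid() wrapper. The source is the same
   on every architecture; the number behind SYS_getpid is not. */
long raw_getpid(void)
{
    return syscall(SYS_getpid);
}
```

A frontend or standard library that bypasses libc and issues the trap instruction itself has to carry a per-architecture table of these numbers and calling conventions.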

Related

How does the Rust compiler handle manufacturer-specified instructions for RISC-V?

As we know, RISC-V allows any manufacturer to add custom instructions to its products, and this is especially common in embedded CPUs. Manufacturers also often provide users with a modified version of GCC to compile code for their chips.
But what about the Rust compiler? It seems that manufacturers seldom provide a modified Rust compiler for their chips.
Will this be a huge disadvantage for Rust in embedded or low-level kernel programming? And how can this problem be solved?
This is one of the reasons LLVM was invented: instead of having to implement a compiler for every language-architecture pair, one only has to implement one frontend for every language and one backend for every architecture. I expect manufacturers to shift more and more from providing a custom GCC to providing a custom LLVM backend, at which point Rust will be able to support that target, since it builds upon LLVM.

Android NDK: Providing library variants for the same abi

I'm looking for the best way to develop and package different variants of a library with different compile settings but for the same ABI and then selecting the best fit at runtime. In more concrete terms, I'd like a NEON and non-NEON armeabi-v7a build.
The native library has a public C interface that third parties link to. They seem to need to link to one of the variants to prevent link errors, but I'd like to load the alternative variant at runtime if it's a better fit for the device, and have the runtime loader do the correct relocations.
From what I see so far it seems I need to give both variants the same file name, so need to put them in different folders. Subfolders under the abi folder don't seem to get copied by the package installation process so that approach doesn't work. The best suggestion I've seen so far is to manually copy one variant from the res folder to a known device path and to call System.loadLibrary() with a full path. Reference: https://groups.google.com/forum/#!topic/android-ndk/zu_dmcmUlMo
Is this still the best/recommended approach?
How will this interact with the binary translation done on non-arm devices? (Although I can supply an x86 build, some third parties may leave it out of their apk).
I'm assuming cpufeatures on a device using binary translation will not report the cpu family as ARM, so my proposed solution would be to build a standard armeabi-v7a library in the normal way (which I guess will get binary translated), and ship a NEON-supporting library in res/raw. Then at runtime if cpufeatures reports an ARM CPU with NEON support then copy out that library and call loadLibrary with the full path. Can anyone see any problems with that approach?
If you explicitly want to have two different builds of a lib, then yes, it's probably the best compromise.
First off - do note that many libraries that can use NEON can be built with those parts runtime-enabled, so that you can have a normal ARMv7 build which doesn't strictly require NEON but can enable those codepaths at runtime if detected - e.g. libav/FFmpeg do that, and the same goes for many other similar libraries. This allows you to have one single ARMv7 binary that fully utilizes NEON where applicable, while still working on the few ARMv7 devices without NEON.
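A rough sketch of that runtime-enabled pattern is below. It is not how libav/FFmpeg do it; it assumes a Linux or Android libc that provides getauxval(), and the names has_neon, pick_sum and sum_scalar are made up for the example. The NDK cpufeatures library mentioned in the question is the other way to ask the same thing.

```c
#if defined(__arm__) && defined(__linux__)
#include <sys/auxv.h>     /* getauxval, AT_HWCAP */
#include <asm/hwcap.h>    /* HWCAP_NEON (32-bit ARM only) */
#endif

/* Returns 1 if NEON code paths may be used by this process, 0 otherwise. */
int has_neon(void)
{
#if defined(__aarch64__)
    return 1;   /* assumption: Advanced SIMD is available on AArch64 */
#elif defined(__arm__) && defined(__linux__)
    return (getauxval(AT_HWCAP) & HWCAP_NEON) != 0;
#else
    return 0;   /* not an ARM build, so there is no NEON path */
#endif
}

typedef int (*sum_fn)(const int *, int);

/* Plain C kernel that works everywhere. */
int sum_scalar(const int *v, int n)
{
    int s = 0;
    for (int i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Picks the implementation once; callers then use the returned pointer. */
sum_fn pick_sum(sum_fn neon_impl, sum_fn scalar_impl)
{
    return has_neon() ? neon_impl : scalar_impl;
}
```

In a real library the NEON routine would live in its own source file built with NEON enabled, so that the rest of the binary stays safe to run on devices without it.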
If you're trying to use compiler autovectorization, or if this is a library where the NEON routines aren't easily confined to restricted parts that are enabled at runtime (or hoping to gain extra performance by building the whole library with NEON enabled), your approach sounds sane.
Keep in mind that you want to have at least one native library that is packaged "normally" (which you seem to have, but which has been an issue in e.g. https://stackoverflow.com/a/29329413/3115956). On installation, the installer picks the best match of the bundled architectures and only extracts the libs from that one, and runs the process in that mode. On devices with multiple ABIs (32 and 64 bit), this is essential since if the process is started in a different mode it's too late to switch mode once you try to load a library in a different form.
On an x86 device that emulates ARM binaries, at least the cpufeatures library will return ARM if the process is running in ARM mode. If you use system properties to find the primary and secondary ABIs, you won't know which of them the current process is using though.
EDIT: x86 devices with binary translation actually seem to be able to load an armeabi library even if the same process already has loaded some bundled x86 libraries as well. So apparently this translation is done on a per library basis, not like 32 vs 64 bit, where a certain mode is chosen for the process at startup, which excludes loading any libraries of the other variant.

Choosing a compact c/c++ compiler for ARM based Embedded Linux System

I am working on an ARM Cortex-A7 based embedded system that runs Linux. I am looking for a C/C++ compiler that is compact and reliable (GCC is around 100 MB). I have shortlisted SDCC, TCC, OTCC, Digital Mars, NWCC, LCC, Small C, and the Portable C Compiler.
I want to know whether compilers depend on the operating system or the hardware, and how I should start narrowing down the list. I am not an expert; I am still learning about Linux systems and embedded environments. If you think I am asking the wrong question or going in the wrong direction, kindly let me know.
Thank you
Note
I already have a cross compiler on my Linux laptop, and I use it to compile the programs that are loaded onto the device. But the embedded system is supposed to accept programs written in a particular language designed by us, and I am hoping to convert that language into equivalent C code and run it. I tried writing my own interpreter in C that parses and executes code in that language, but it is a little slow; the same instructions written directly in C gave satisfactory results.
Edit:
I ended up using g++ on the system to compile the code, as the main function of the system was to use generated code.
Generally, when dealing with embedded systems you are better off cross-compiling and sending the binaries to the device than compiling directly on it. Even though setting up the toolchain may take some time at the beginning, it definitely pays off in build time.
There are several pre-built Linaro GCC toolchains, which are cross-compilers with (generally) x86 Linux as the host and ARM Linux as the target platform. This way, you do not need to worry about compiler size.

Bare metal cross compilers input

What are the input limitations of a bare metal cross compiler? As in, does it refuse to compile programs with pointers or mallocs, or anything that would require more than the underlying hardware? Also, how can one find these limitations?
I also wanted to ask: I built a cross compiler for a MIPS target, and I need to create a MIPS executable with it, but I am not able to find where the executable is. There is one executable I found, mipsel-linux-cpp, which is supposed to compile, assemble and link and then produce a.out, but it is not doing so.
However, ./cc1 gives MIPS assembly.
There is an install folder which has a gcc executable that uses i386 assembly and then gives an executable. I don't understand how the gcc executable can give i386 and not MIPS assembly when I have specified the target as MIPS.
Please help, I'm really not able to understand what is happening.
I followed the following steps:
1. Installed binutils 2.19
2. Configured GCC for MIPS (g++, core)
I would suggest that you should have started two separate questions.
The GNU toolchain does not have any OS dependencies, but the GNU library does. Most bare-metal cross builds of GCC use the Newlib C library, which provides a set of syscall stubs that you must map to your target yourself. These stubs include the low-level calls necessary to implement stream I/O and heap management. They can be very simple or very complex depending on your needs. If the only I/O support is to a UART for stdin/stdout/stderr, then it is simple. You don't have to implement everything, but if you do not implement the I/O stubs, you won't be able to use printf(), for example. You must implement the sbrk()/sbrk_r() syscall if you want malloc() to work.
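As a rough idea of what two of those stubs can look like, here is a minimal sketch. The stub names _sbrk and _write are the ones conventionally used with Newlib; everything else is an assumption for illustration: a fixed array stands in for the heap region a linker script would describe, and a capture buffer stands in for the UART data register.

```c
#include <errno.h>
#include <stddef.h>

/* Assumption: a fixed array replaces the heap region that a real
   linker script would provide. */
static char heap[4096];
static size_t heap_used;

/* Grows the heap by incr bytes and returns the previous break, or
   (void *)-1 with errno = ENOMEM when the region is exhausted.
   Shrinking (negative incr) is not supported in this sketch. */
void *_sbrk(ptrdiff_t incr)
{
    if (incr < 0 || (size_t)incr > sizeof heap - heap_used) {
        errno = ENOMEM;
        return (void *)-1;
    }
    void *prev = &heap[heap_used];
    heap_used += (size_t)incr;
    return prev;
}

/* Hypothetical UART transmit: a buffer stands in for the data register. */
static char uart_log[256];
static size_t uart_len;

static void uart_putc(char c)
{
    if (uart_len < sizeof uart_log)
        uart_log[uart_len++] = c;
}

/* Sends len bytes to the UART, whichever file descriptor is named. */
int _write(int file, const char *ptr, int len)
{
    (void)file;
    for (int i = 0; i < len; i++)
        uart_putc(ptr[i]);
    return len;
}
```

On real hardware uart_putc would poll a status flag and write the byte to the transmit register documented for your part.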
The GNU C++ library will work correctly with Newlib as its underlying library. If you use C++, the C runtime start-up (usually crt0.s) must include the static initialiser loop to invoke the constructors of any static objects that your code may include. The run-time start-up must also of course initialise the processor, clocks, SDRAM controller, timers, MMU etc; that is your responsibility, not the compiler's.
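The static initialiser loop is small. Here is a sketch in C, written so that it can be tried on a host; on a real target the two pointers would come from symbols defined by the linker script (with GNU ld these are commonly __init_array_start and __init_array_end, which is an assumption about your script). The demo constructors are made up for the example.

```c
typedef void (*init_fn)(void);

/* Calls every function pointer in [begin, end) in order, as the
   start-up code does for the table of static constructors. */
void run_init_array(init_fn *begin, init_fn *end)
{
    for (init_fn *p = begin; p != end; ++p)
        (*p)();
}

/* Demo "constructors" so the loop can be exercised without a target. */
static int ctor_calls;
static void ctor_a(void) { ctor_calls += 1; }
static void ctor_b(void) { ctor_calls += 10; }

/* Runs the loop over a local table and reports what was called. */
int demo_run(void)
{
    init_fn table[] = { ctor_a, ctor_b };
    ctor_calls = 0;
    run_init_array(table, table + 2);
    return ctor_calls;   /* 11 when both constructors ran once */
}
```

The start-up code calls this loop after it has copied .data and zeroed .bss, and before it calls main().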
I have no experience of MIPS targets, but the principles are the same for all processors. There is a very useful article called "Building Bare Metal ARM with GNU" which you may find helpful; much of it will be relevant, especially the parts about implementing the Newlib stubs.
Regarding your other question, if your compiler is called mipsel-linux-cpp, then it is not a 'bare-metal' build but rather a Linux build. Also, this executable does not really "compile, assemble and link"; it is rather a driver that separately calls the pre-processor, compiler, assembler and linker. It has to be configured correctly to invoke the cross-tools rather than the host tools. I generally invoke the linker separately in order to enforce decisions about which standard library to link (-nostdlib), and also because it makes more sense when an application is comprised of multiple execution units. I cannot offer much help other than that here, since I have always used GNU-ARM tools built by people with obviously more patience than me, and moreover hosted on Windows, where there is less possibility of the host tool-chain being invoked instead (one reason why I have also avoided those tool-chains that rely on Cygwin).
EDIT
With more time available, I have rewritten my original answer in an attempt to provide something more useful.
I cannot provide a specific answer for your question. I have never tried to get code running on a MIPS machine. What I do have is plenty of experience getting a variety of "bare metal" boards up and running. All kinds of CPUs and all kinds of compilers and cross compilers. So I have an understanding of the principles that apply in all such situations. I will point out the kind of knowledge you will need to absorb before you can hope to succeed with a job like this, and hopefully I can list some links to resources to get you started on learning that knowledge.
I am worried you don't know that pointers are exactly the kind of thing a bare metal compiler can handle; they are a basic machine primitive. This tells me you are probably not an expert embedded developer who is just stuck in this particular scenario. Never mind. There isn't anything magic about programming an embedded system, and you can learn what you need to know.
The first step is getting to understand the relationship between C and the machine you wish to run code on. Basically C is a portable assembly language. This means that C is good for manipulating the basic operations of the machine. In this sense the basic operations of the machine are reading and writing memory locations, performing arithmetic and boolean operations on the data read from memory, and making branching and looping decisions based on that data. In particular the C concept of pointers allows you to manipulate data at locations in memory that you specify.
So far so good, but just doing raw computations in memory is not usually enough - you need a way to input and output data from memory. To do that you need to manipulate the hardware peripherals on your board. If the hardware peripherals are memory mapped, then the machine registers used to control the peripherals look exactly like memory locations and C can manipulate them directly. Even in that case though, it is much more likely that doing useful I/O is best handled by extending the C core language with a library of routines provided just for that purpose. These library routines handle all the nasty details (timers, interrupts, non-memory mapped I/O) involved in manipulating the peripheral hardware on the board, and wrap them up with a convenient C function call interface. The idea is that you can simply write printf("hello world"); and the library call takes care of the details of displaying the string.
An appropriately skilled developer knows how to adapt an existing I/O library to a new board, or how to develop new library routines to provide access to non-standard custom hardware. The classic way to develop these skills is to start with something simple, usually an LED for an output device, and a switch for an input device. Write a program that pulses an LED in a predictable way, or reads a switch and reflects it on an LED. The first time you get this working will be hugely satisfying.
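The switch-to-LED exercise can be sketched as follows. This is only an illustration: the function names are made up, and the register pointers are parameters so that the code can be tried on a host with ordinary variables before real register addresses from a datasheet are substituted.

```c
#include <stdint.h>

/* Sets or clears one bit of a peripheral register. */
void gpio_write_pin(volatile uint32_t *reg, unsigned pin, int on)
{
    if (on)
        *reg |= (uint32_t)1 << pin;
    else
        *reg &= ~((uint32_t)1 << pin);
}

/* Reads one bit of a peripheral register. */
int gpio_read_pin(const volatile uint32_t *reg, unsigned pin)
{
    return (int)((*reg >> pin) & 1u);
}

/* Copies the state of the switch input onto the LED output. */
void mirror_switch_to_led(const volatile uint32_t *in_reg, unsigned sw_pin,
                          volatile uint32_t *out_reg, unsigned led_pin)
{
    gpio_write_pin(out_reg, led_pin, gpio_read_pin(in_reg, sw_pin));
}
```

The volatile qualifier matters here: it tells the compiler that every read and write of the register must really happen, because the hardware can change the value behind the program's back.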
Okay, I have rambled enough. It is time to provide some more resources for you to study. The good news is that there's never been a better time to learn how things work at the interface between hardware and software. There is a wealth of freely available code and docs. Stackoverflow is a great resource, as you know. Good luck! Links follow:
Embedded systems overview
Knowing the C language well is fundamental
Why not get your code working on a simulator before you try real hardware
Another emulated environment
Linux device drivers - an overlapping subject
Another book about bare metal programming

How to create a shared object that is statically linked with pthreads and libstdc++ on Linux/gcc?

Before I go to answering your question as it was described, I will note that it is not exactly clear what you are trying to achieve in the end, and there is probably a better solution to your problem.
That said - there are two main problems with trying to do what you described:
One is that you will need to decompose libpthread and libstdc++ into the object files they are made from. This is because ELF binaries (used on Linux) have two levels of "run time" library loading - even when an executable is statically linked, the loader has to load the statically linked libraries within the binary on execution, and map the right memory addresses. This is done before the shared linkage of libraries that are dynamically loaded (shared objects) and mapped to shared memory. Thus, a shared object cannot be statically linked with such libraries, as at the time the object is loaded, all statically linked libraries were loaded already. This is one difference between linking with a static library and a plain object file - a static library is not merely glued like any object file into the executable, but still contains separate tables which are referred to on loading. (I believe that this is in contrast to the much simpler static libraries in MS-DOS and classic Windows, .LIB files, but there may be more to those than I remember.)
Of course you do not actually have to decompose libpthread and libstdc++, you can just use the object files generated when building them. Collecting them may be a bit difficult though (look for the objects referred to by the final Makefile rule of those libraries). And you would have to use ld directly and not gcc/g++ to link, to avoid linking with the dynamic versions as well.
The second problem is consequential. If you do the above, you will surely have such a shared object / dynamic library as you asked to build. However, it will not be very useful, as once you try to link a regular executable that uses libpthread/libstdc++ (the latter being any C++ program) with this shared object, it will fail with symbol conflicts - the symbols of the static libpthread/libstdc++ objects you linked your shared object against will clash with the symbols from the standard libpthread/libstdc++ used by that executable, no matter if it is dynamically or statically linked with the standard libraries.
You could of course then try to either hide all symbols in the static objects from libstdc++/libpthread used by your shared library, make them private in some way, or rename them automatically on linkage so that there will be no conflict. However, even if you get that to work, you will find some undesirable results at runtime, since both libstdc++ and libpthread keep quite a bit of state in global variables and structures, which you would now have duplicated, with each copy unaware of the other. This will lead to inconsistencies between these global data and the underlying operating system state such as file descriptors and memory bounds (and perhaps some values from the standard C library such as errno for libstdc++, and signal handlers and timers for libpthread).
To avoid over-broad interpretation, I will add a remark: at times there can be sensible grounds for wanting to statically link against even such basic libraries as libstdc++ and even libc, and even though it is becoming a bit more difficult with recent systems and versions of those libraries (due to a bit of coupling with the loader and special linker tricks used), it is definitely possible - I did it a few times, and know of other cases in which it is still done. However, in that case you need to link a whole executable statically. Static linkage with standard libraries combined with dynamic linkage with other objects is not normally feasible.
Edit: One issue which I forgot to mention but is important to take into account is C++ specific. C++ was unfortunately not designed to work well with the classic model of object linkage and loading (used on Unix and other systems). This makes shared libraries in C++ not really portable as they should be, because a lot of things such as type information and templates are not cleanly separated between objects (often being taken, together with a lot of actual library code, at compile time from the headers). libstdc++ for that reason is tightly coupled with GCC, and code compiled with one version of g++ will in general only work with the libstdc++ from this (or a very similar) version of g++. As you will surely notice if you ever try to build a program with GCC 4 with any non-trivial library on your system that was built with GCC 3, this is not just libstdc++. If your reason for wanting to do that is trying to ensure that your shared object is always linked with the specific versions of libstdc++ and libpthread that it was built against, this would not help, because a program that uses a different/incompatible libstdc++ would also be built with an incompatible C++ compiler or version of g++, and would thus fail to link with your shared object anyway, aside from the actual libstdc++ conflicts.
If you wonder "why wasn't this done simpler?", a general rumination worth pondering: for C++ to work nicely with dynamic/shared libraries (meaning compatibility across compilers, and the ability to replace a dynamic library with another version with a compatible interface without rebuilding everything that uses it), not just compiler standardization is needed. At the level of the operating system's loader, the structure and interface of object and library files and the work of the linker would need to be significantly extended beyond the relatively simple Unix classics used on common operating systems (Microsoft Windows, Mach based systems and NeXTStep relatives such as Mac OS, VMS relatives and some mainframe systems also included) for natively built code today. The linker and dynamic loader would need to be aware of such things as templates and typing, having to some extent the functionality of a small compiler to actually adapt the library's code to the type given to it - and (personal subjective observation here) it seems that higher-level intermediate code (together with higher-level languages and just-in-time compilation) is gaining ground faster and is likely to be standardized sooner than such extensions to the native object formats and linkers.
You mentioned in a separate comment that you are trying to port a C++ library to an embedded device. (I am adding a new answer here instead of editing my original answer here because I think other StackOverflow users interested in this original question may still be interested in that answer in its context)
Obviously, depending on how stripped down your embedded system is (I have not much embedded Linux experience, so I am not sure what is most likely), you may of course be able to just install the shared libstdc++ on it and dynamically link everything as you would do otherwise.
If dynamically linking with libstdc++ would not be good for you or not work on your system (there are so many different levels of embedded systems that one cannot know), and you need to link against a static libstdc++, then as I said, your only real option is static linking the executable using the library with it and libstdc++. You mentioned porting a library to the embedded device, but if this is for the purpose of using it in some code you write or build on the device and you do not mind a static libstdc++, then linking everything statically (aside from perhaps libc) is probably OK.
If the size of libstdc++ is a problem, and you find that your library is actually only using a small part of its interfaces, then I would nonetheless suggest first trying to determine the actual space you would save by linking against only the parts you need. It may be significant or not; I never looked that deep into libstdc++ and I suspect that it has a lot of internal dependencies, so while you surely do not need some of the interfaces, you may or may not still depend on a big part of its internals - I do not know and did not try, but it may surprise you. You can get an idea by just linking a binary using the library against a static build of it and libstdc++ (not forgetting to strip the binary, of course), and comparing the size of the resulting executable with the total size of a (stripped) executable dynamically linked together with the full (stripped) shared objects of the library and libstdc++.
If you find that the size difference is significant, but do not want to statically link everything, you can try to reduce the size of libstdc++ by rebuilding it without some parts you know that you do not need (there are configure-time options for some parts of it, and you can also try to remove some independent objects at the final creation of libstdc++.so). There are some tools to optimize the size of libraries - search the web (I recall one from a company named MontaVista but do not see it on their web site now; there are some others too).
Other than the straightforward above, some ideas and suggestions to think of:
You mentioned that you use uClibc, which I never fiddled with myself (my experience with embedded programming is a lot more primitive, mostly involving assembly programming for the embedded processor and cross-compiling with minimal embedded libraries). I assume you checked this, and I know that uClibc is intended to be a lightweight but rather full standard C library, but do not forget that C++ code is hardly independent of the C library, and g++ and libstdc++ depend on some quite delicate things (I remember problems with libc on some proprietary Unix versions), so I would not just assume that g++ or the GNU libstdc++ actually works with uClibc without trying - I don't recall seeing it mentioned in the uClibc pages.
Also, if this is an embedded system, think of its performance, compute power, overall complexity, and timing/simplicity/solidity requirements. Take into consideration the complexity involved, and think whether using C++ and threads is appropriate in your embedded system, and if nothing else in the system uses those, whether it is worth introducing for that library. It may be, not knowing the library or system I cannot tell (again, embedded systems being such a wide range nowadays).
And in this case also, just a quick link I stumbled upon looking for uClibc -- if you are working on an embedded system, using uClibc, and want to use C++ code on it -- take a look at uClibc++. I do not know how much of the standard C++ stuff you need and it already supports, and it seems to be an ongoing project, so not clear if it is in a state good enough for you already, but assuming that your work is also under development still, it might be a good alternative to GCC's libstdc++ for your embedded work.
I think this guy explains quite well why that wouldn't make sense. C++ code that uses your shared object but a different libstdc++ would link alright, but wouldn't work.

Resources