how does rust compiler handle manufacturer specified instructions for riscv? - rust

As we know, riscv allow any manufacturer to add their custom instructions for their products, this is especially common in embedded cpu. And also, the manufacturers often provides the user with their modified version of GCC to compile code for there chips.
But how about the rust compiler? It seems that seldom of manufacturer will provide a modified rust compiler for there chips.
Will this be a huge disadvantage for rust when use rust in embedded or low level kernel programming? And how to solve this problem?

This is one of the reasons llvm was invented, instead of having to implement a compiler for every language-architecture pair one has only to implement one frontend for every language and one backend for every architecture, I expect manufactures more and more to shift from providing a custom gcc to provide a custom llvm backend at which point rust will support that target since it builds upon llvm.

Related

Is it possible to use SIMD instructions in Rust?

In C/C++, you can use intrinsics for SIMD (such as AVX and AVX2) instructions. Is there a way to use SIMD in Rust?
The answer is yes, with caveats:
It is available on stable for x86 and x86_64 through the core::arch module, reexported as std::arch.
Other CPUs require the use of a nightly compiler for now.
Not all instructions may be available through core::arch, in which case inline assembly is necessary, which also requires a nightly compiler.
The std::arch module only provides CPU instructions as intrinsics, and requires the use of unsafe blocks as well as specific feature on the functions containing those instructions to properly align arguments. The documentation of std::arch is a good starting point for compile-time and run-time detection of CPU features.
As noted in the documentation, higher level APIs will likely be available at some point in the future under std::simd (and possibly core::simd); a sneak preview being available in the stdsimd crate:
Ergonomics
It's important to note that using the arch module is not the easiest thing in the world, so if you're curious to try it out you may want to brace yourself for some wordiness!
The primary purpose of this module is to enable stable crates on crates.io to build up much more ergonomic abstractions which end up using SIMD under the hood. Over time these abstractions may also move into the standard library itself, but for now this module is tasked with providing the bare minimum necessary to use vendor intrinsics on stable Rust.
Note: you may also possibly use the FFI to link in a library that does so for you; for example Shepmaster's cupid crate uses such a strategy to access cpu features at runtime.

Difference between arm-none-eabi and arm-linux-gnueabi?

What is the difference between arm-none-eabi and arm-linux-gnueabi? I know the difference in how to use them (one for bare metal software, the other one for software meant to be run on linux). But what is the technical background?
I see there is a difference in the ABI which is, as far as I understood, something like an API but on binary level. It ensures interoperability of different applications.
But I don't really understand in which way having or not having an operating system affects my toolchain. The only thing that came to my mind is, that libraries maybe have to be statically linked (do they?) while compiling bare metal software, because there is no os dynamically providing them.
The most pages I found related to this toppic just answered how to use the toolchains but not the technical background. I'm a student of mechatronics and new to embedded systems, so my experience in this field is somewhat limited.
Maybe this link to a description will help.
Probably the biggest difference:
"The bare-metal ABI will assume a different C library (newlib for example, or even no C library) to the Linux ABI (which assumes glibc). Therefore, the compiler may make different function calls depending on what it believes is available above and beyond the Standard C library."

When writing code compiled by LLVM backend, does architecture matter?

My question is actually more general than the title:
At what point does the architecture matter when writing code that will eventually be compiled to LLVM intermediary code, and then from there to the machine language?
Let's say I'm writing Rust (which uses LLVM as a backend). Am I automatically capable of compiling my Rust code to every architecture that LLVM can target (assuming there's an OS on that machine that can run it)?
Or could it be that the Rust standard library hasn't been made "ARM compatible" yet, so I couldn't compile to ARM even if the LLVM targets it?
What if I don't use any of the standard library, my entire program is just a program that returns right away? Could it be the case that even without any libraries, Rust (or what have you) can't compile to ARM (or what have you) even if the LLVM targets it?
If all the above examples compile just fine, what do I have to do to get my code to break on one architecture not compile to a certain architecture?
Bonus question of the same variety:
Let's say the standard library makes use of OS system calls (which is surely does). Do you have to care about architecture when making system calls? Or does the OS (Linux, for example) abstract away architecture as well?
Thanks.
TL;DR
From my understanding you can compile to any target LLVM supports (there may still be a few caveats here with frontends using inline assembler or module level inline assembly), however, you are not guaranteed it will actually execute correctly. The frontend is responsible for doing the work to be portable across the platforms the author supports.
Note also that as a frontend developer you are responsible for providing the data layout and target triple.
see also:
llvm-bitcode-cross-platform
llvm
FAQ
Implementing Portable
sizeof
Cross Compile with Clang
Your Questions:
Let's say I'm writing Rust (which uses LLVM as a backend). Am I
automatically capable of compiling my Rust code to every architecture
that LLVM can target (assuming there's an OS on that machine that can
run it)?
This is dependent on the authors of the Rust frontend.
Or could it be that the Rust standard library hasn't been made "ARM
compatible" yet, so I couldn't compile to ARM even if the LLVM targets
it?
I'm pretty sure LLVM would be able to emit the instructions, but it may not be correct in terms of addressing.
I have not used the inline assembler facilities mentioned above myself, but I assume if it allows platform specific assembly then this would break platform agnostic compilation as well.
What if I don't use any of the standard library, my entire program is
just a program that returns right away? Could it be the case that even
without any libraries, Rust (or what have you) can't compile to ARM
(or what have you) even if the LLVM targets it?
This again depends on what the Rust frontend emits. There may be some boilerplate setup logic it emits even before it emits instructions for your logic.
I'm writing my own language in LLVM that does this in the case of a special function called "main". I am targeting the C ABI so it will wrap this main with a proper C style main and invoke it with a stricter set of parameters.
If all the above examples compile just fine, what do I have to do to
get my code to break on one architecture not compile to a certain
architecture?
Consider C/C++ with Clang as mentioned in the llvm FAQ. Clang is a frontend, probably the most popular, for LLVM and the users writing C/C++ are responsible for #include-ing the appropriate platform specific functionality.
Some languages may be designed more platform independent and the frontend could then handle the work for you.
Let's say the standard library makes use of OS system calls (which is
surely does). Do you have to care about architecture when making
system calls? Or does the OS (Linux, for example) abstract away
architecture as well?
I'm assuming you are talking about the case where the frontend targets the C standard library in which case LLVM has standard C library intrinsics which could be used by the frontend. This is not the only way, however, as you can use the call instruction to invoke C functions directly if targeting the C ABI as in the Kaleidoscope example.
In the end the standard library can be a portability issue and must be addressed by the frontend developers.

Bare metal cross compilers input

What are the input limitations of a bare metal cross compiler...as in does it not compile programs with pointers or mallocs......or anything that would require more than the underlying hardware....also how can 1 find these limitations..
I also wanted to ask...I built a cross compiler for target mips..i need to create a mips executable using this cross compiler...but i am not able to find where the executable is...as in there is 1 executable which i found mipsel-linux-cpp which is supposed to compile,assemble and link and then produce a.out but it is not doing so...
However the ./cc1 gives a mips assembly.......
There is an install folder which has a gcc executable which uses i386 assembly and then gives an exe...i dont understand how can the gcc exe give i386 and not mips assembly when i have specified target as mips....
please help im really not able to understand what is happ...
I followed the foll steps..
1. Installed binutils 2.19
2. configured gcc for mips..(g++,core)
I would suggest that you should have started two separate questions.
The GNU toolchain does not have any OS dependencies, but the GNU library does. Most bare-metal cross builds of GCC use the Newlib C library which provides a set of syscall stubs that you must map to your target yourself. These stubs include low-level calls necessary to implement stream I/O and heap management. They can be very simple or very complex depending on your needs. If the only I/O support is to a UART to stdin/stdout/stderr, then it is simple. You don't have to implement everything, but if you do not implement teh I/O stubs, you won't be able to use printf() for example. You must implement the sbrk()/sbrk_r() syscall is you want malloc() to work.
The GNU C++ library will work correctly with Newlib as its underlying library. If you use C++, the C runtime start-up (usually crt0.s) must include the static initialiser loop to invoke the constructors of any static objects that your code may include. The run-time start-up must also of course initialise the processor, clocks, SDRAM controller, timers, MMU etc; that is your responsibility, not the compiler's.
I have no experience of MIPS targets, but the principles are the same for all processors, there is a very useful article called "Building Bare Metal ARM with GNU" which you may find helpful, much of it will be relevant - especially porting the parts regarding implementing Newlib stubs.
Regarding your other question, if your compiler is called mipsel-linux-cpp, then it is not a 'bare-metal' build but rather a Linux build. Also this executable does not really "compile, assemble and link", it is rather a driver that separately calls the pre-processor, compiler, assembler and linker. It has to be configured correctly to invoke the cross-tools rather than the host tools. I generally invoke the linker separately in order to enforce decisions about which standard library to link (-nostdlib), and also because it makes more sense when a application is comprised of multiple execution units. I cannot offer much help other than that here since I have always used GNU-ARM tools built by people with obviously more patience than me, and moreover hosted on Windows, where there is less possibility of the host tool-chain being invoked instead (one reason why I have also avoided those tool-chains that rely on Cygwin)
EDIT
With more time available, I have rewritten my original answer in an attempt to provide something more useful.
I cannot provide a specific answer for your question. I have never tried to get code running on a MIPS machine. What I do have is plenty of experience getting a variety of "bare metal" boards up and running. All kinds of CPUs and all kinds of compilers and cross compilers. So I have an understanding of the principles that apply in all such situations. I will point out the kind of knowledge you will need to absorb before you can hope to succeed with a job like this, and hopefully I can list some links to resources to get you started on learning that knowledge.
I am worried you don't know that pointers are exactly the kind of thing a bare metal compiler can handle, they are a basic machine primitive. This tells me you are probably not an expert embedded developer who is just stuck in this particular scenario. Never mind. There isn't anything magic about programming an embedded system, and you can learn what you need to know.
The first step is getting to understand the relationship between C and the machine you wish to run code on. Basically C is a portable assembly language. This means that C is good for manipulating the basic operations of the machine. In this sense the basic operations of the machine are reading and writing memory locations, performing arithmetic and boolean operations on the data read from memory, and making branching and looping decisions based on that data. In particular the C concept of pointers allows you to manipulate data at locations in memory that you specify.
So far so good, but just doing raw computations in memory is not usually enough - you need a way to input and output data from memory. To do that you need to manipulate the hardware peripherals on your board. If the hardware peripherals are memory mapped then the machine registers used to control the peripherals look exactly like memory locations and C can manipulate them directly. Even in that case though, it is much more likely that doing useful I/O is best handled by extending the C core language with a library of routines provided just for that purpose. These library routines handle all the nasty details (timers, interrupts, non-memory mapped I/O) involved in manipulating the peripheral hardware on the board, and wrap them up with a convenient C function call interface. The idea is that you can go simply printf("hello world"); and the library call take care of the details of displaying the string.
An appropriately skilled developer knows how to adapt an existing I/O library to a new board, or how to develop new library routines to provide access to non-standard custom hardware. The classic way to develop these skills is to start with something simple, usually a LED for an output device, and a switch for an input device. Write a program that pulses a LED in a predictable way, or reads a switch and reflects in on a LED. The first time you get this working will be hugely satisfying.
Okay I have rambled enough. It is time to provide some more resources for you to study. The good news is that there's never been a better time to learn how things work at the interface between hardware and software. There is a wealth of freely available code and docs. Stackoverflow is a great resource as you know. Good luck! Links follow;
Embedded systems overview
Knowing the C language well is fundamental
Why not get your code working on a simulator before you try real hardware
Another emulated environment
Linux device drivers - an overlapping subject
Another book about bare metal programming

Is assembler portable between Linux distros?

Is a program shipped in assembler format portable between Linux distributions (modulo CPU architecture differences)?
Here's the background to my question: I'm working on a new programming language (named Aklo), whose modus operandi will be the classic compiling to .s and feeding the result to the GNU assembler.
Obviously it would be nice ultimately to have the implementation written in itself, but I had resigned myself to maintaining it in C++ to solve the chicken and egg problem: suppose you download the compiler for the first time and it is itself written in Aklo, how do you compile it? As I understand it, different Linux distributions and other UNIX like systems have different conventions for binary formats.
But it's just occurred to me, a solution might be to ship the .s file (well, one per CPU architecture): it's fair to assume you have or can install the GNU assembler. Of course I'd still need a bootstrap compiler, but that doesn't need to be fast; I can write it in Python.
Is assembler portable in the way that binaries are not? Are there any other stumbling blocks I haven't thought of?
Added in response to one answer:
I had looked wistfully at LLVM, there is certainly a lot of good stuff there and it would make my life easier -- except that it would incur a dependency on the correct version of LLVM being installed. It wouldn't be so bad having that dependency on development machines, but in a world where it's common to ship programs as source, the same dependency would be incurred for every user of every program ever written in Aklo, and I decided that was too high a price to pay.
But if the solution of shipping compiled programs as assembler works... then that solves that problem, and I can use LLVM after all, which would be a big win.
So the question about portability of assembler is even considerably more important than I had first realized.
Conclusion: from answers here and on the LLVM mailing list http://lists.cs.uiuc.edu/pipermail/llvmdev/2010-January/028991.html it seems the bad news is the problem is unsolvable, but the good news is that means using LLVM makes it no worse, so I'm free to do so and obtain all the advantages thereof.
You might want to check out LLVM before going down this particular path. It might make your life a lot easier, as it provides a low level virtual machine that makes a lot of hard stuff just work and has been very popular.
At a very high level, the ABI consists of { instruction set, system calls, binary format, libraries }.
Distribution as .s may free you from the binary format. This is still rather pointless, because you are fixed to a particular ISA and still need to use libraries and/or make system calls. Libraries vary from distribution to distribution (although this isn't really that bad, especially if you just use libc) and syscalls vary from OS to OS.
It's basically 20 years since I last bootstrapped a C compiler. At the level of compilers, the differences between Linux distributions are minimal.
The much more important reason for going LLVM is cross-platform; if you're not writing some intermediate language, your compiler will be extremely difficult to retarget for different processors. And seeing as, on my laptop, I have compilers for x86, x86_64, two kinds of MIPS, PowerPC, ARM and AVR... you see where I'm going? I can compile multiple languages for most of those targets too (only C for AVR).

Resources