Is it possible to use SIMD instructions in Rust?

In C/C++, you can use intrinsics for SIMD (such as AVX and AVX2) instructions. Is there a way to use SIMD in Rust?

The answer is yes, with caveats:
It is available on stable for x86 and x86_64 through the core::arch module, reexported as std::arch.
Other CPUs require the use of a nightly compiler for now.
Not all instructions may be available through core::arch, in which case inline assembly is necessary, which also requires a nightly compiler.
The std::arch module only provides CPU instructions as intrinsics. Using them requires unsafe blocks, as well as enabling the specific target feature on the functions containing those instructions, to properly align arguments. The documentation of std::arch is a good starting point for compile-time and run-time detection of CPU features.
As noted in the documentation, higher level APIs will likely be available at some point in the future under std::simd (and possibly core::simd); a sneak preview being available in the stdsimd crate:
Ergonomics
It's important to note that using the arch module is not the easiest thing in the world, so if you're curious to try it out you may want to brace yourself for some wordiness!
The primary purpose of this module is to enable stable crates on crates.io to build up much more ergonomic abstractions which end up using SIMD under the hood. Over time these abstractions may also move into the standard library itself, but for now this module is tasked with providing the bare minimum necessary to use vendor intrinsics on stable Rust.
Note: you may also be able to use the FFI to link in a library that does so for you; for example, Shepmaster's cupid crate uses such a strategy to access CPU features at runtime.
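To make the caveats above concrete, here is a minimal sketch of the std::arch pattern: an unsafe function with a target feature enabled, guarded by run-time detection, with a scalar fallback. The function names (add8, add_avx, add_scalar) are made up for illustration, and AVX is chosen only as an example feature.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

/// Portable fallback used when AVX is unavailable (hypothetical helper).
fn add_scalar(a: &[f32; 8], b: &[f32; 8]) -> [f32; 8] {
    let mut out = [0.0f32; 8];
    for i in 0..8 {
        out[i] = a[i] + b[i];
    }
    out
}

/// AVX version. The attribute lets the compiler emit AVX instructions for
/// this function only; the caller must check that the CPU supports them.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx")]
unsafe fn add_avx(a: &[f32; 8], b: &[f32; 8]) -> [f32; 8] {
    let mut out = [0.0f32; 8];
    unsafe {
        // Unaligned loads and stores, so plain arrays are fine here.
        let va = _mm256_loadu_ps(a.as_ptr());
        let vb = _mm256_loadu_ps(b.as_ptr());
        _mm256_storeu_ps(out.as_mut_ptr(), _mm256_add_ps(va, vb));
    }
    out
}

/// Safe entry point: picks the AVX path only after run-time detection.
fn add8(a: &[f32; 8], b: &[f32; 8]) -> [f32; 8] {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx") {
            // Sound because the feature was just detected on this CPU.
            return unsafe { add_avx(a, b) };
        }
    }
    add_scalar(a, b)
}

fn main() {
    let a = [1.0f32, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0];
    let b = [10.0f32; 8];
    println!("{:?}", add8(&a, &b));
}
```

Keeping the unsafe call behind a safe wrapper that performs the detection is one common way to contain the wordiness the documentation warns about.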

Related

how does rust compiler handle manufacturer specified instructions for riscv?

As we know, RISC-V allows any manufacturer to add custom instructions for their products; this is especially common in embedded CPUs. The manufacturers also often provide users with a modified version of GCC to compile code for their chips.
But what about the Rust compiler? It seems that few manufacturers provide a modified Rust compiler for their chips.
Will this be a huge disadvantage for Rust when it is used in embedded or low-level kernel programming? And how can this problem be solved?
This is one of the reasons LLVM was invented: instead of having to implement a compiler for every language-architecture pair, one only has to implement one frontend per language and one backend per architecture. I expect manufacturers to shift more and more from providing a custom GCC to providing a custom LLVM backend, at which point Rust will support that target, since it builds upon LLVM.

Is there any disadvantage to referencing modules through `core` instead of `std`?

Rust's standard library is exposed as two packages: std and core. In terms of API, the functionality in core is the subset of std that can be supported without depending on any operating system integration or heap allocation. When writing imports for my libraries, I've been tempted to always refer to modules via the more-compatible core instead of std, if they're available in both.
However, it's been unclear to me whether their implementations of the same functionality could vary. If I use core::cell::RefCell, could I get an implementation that's less efficient than if I'd referred to std::cell::RefCell?
Is there any disadvantage to referencing modules through core instead of std if they're available in both?
Rust aims to be a general purpose language that can run on many kinds of architectures (x86_64, i686, PowerPC, ARM, RISC-V) and systems (Windows, macOS, Linux) and even embedded systems with no Operating System.
But when you don't have an OS, you don't necessarily have a memory allocator or file handling, because those are things an OS would normally provide.
This is where #![no_std] comes into play. If you put that directive in your lib.rs, you will tell the Rust compiler to not link the std crate, but only use core instead. As you said, core is a subset of std and has (mostly) everything that does not require allocation of memory or other things that require an underlying OS.
There is no difference in the actual implementation though. If the function is provided in core, the function in std is just a re-export.
TL;DR: Use std if you have an Operating System running, else use core. There is no need to mix them.
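As a small illustration of the re-export point, the following sketch checks that core::cell::RefCell and std::cell::RefCell are the same type. The aliases and helper functions (CoreRefCell, StdRefCell, read_std, same_type) are names invented for this example.

```rust
use core::cell::RefCell as CoreRefCell;
use std::any::TypeId;
use std::cell::RefCell as StdRefCell;

/// Accepts the std path; a value created through the core path is
/// accepted as well, because both paths name the same type.
fn read_std(cell: &StdRefCell<i32>) -> i32 {
    *cell.borrow()
}

/// Compares the type identities of the two paths at run time.
fn same_type() -> bool {
    TypeId::of::<CoreRefCell<i32>>() == TypeId::of::<StdRefCell<i32>>()
}

fn main() {
    let cell = CoreRefCell::new(5);
    // This line would not type-check if the two paths were distinct types.
    println!("value: {}", read_std(&cell));
    println!("same type: {}", same_type());
}
```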

Is there a way to detect the compiler version from within a Rust program?

In C++, you could use something like __clang_version__. Is there something similar for Rust? I searched on the internet, but found nothing.
Not directly.
There is the rustc_version crate which tells you the version of rustc accessible on the command-line; this is designed to be used in a build script. There's also rustc_version_runtime which does something similar, but exposes the information as a runtime call (i.e. it detects the compiler version at compile time, but exposes it at runtime).
Standard disclaimer: be very careful writing anything that depends on compiler version. You should ideally only test for minimum versions for which features are supported using semver (which both of the above libraries support directly).
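For illustration only, here is a std-only sketch of what a build script along these lines might do: run the compiler with --version, parse the result, and test for a minimum version. This is not the implementation of either crate mentioned above; the function names and the has_stable_simd cfg name are hypothetical, and the 1.27 threshold is just an example (the release in which std::arch became stable).

```rust
use std::env;
use std::process::Command;

/// Parses output such as "rustc 1.70.0 (90c541806 2023-05-31)" into
/// (major, minor, patch). Returns None if the text has another shape.
fn parse_rustc_version(output: &str) -> Option<(u32, u32, u32)> {
    let version = output.split_whitespace().nth(1)?;
    // Drop a pre-release suffix such as "-nightly" or "-beta.1".
    let numbers = version.split('-').next()?;
    let mut parts = numbers.split('.');
    let major: u32 = parts.next()?.parse().ok()?;
    let minor: u32 = parts.next()?.parse().ok()?;
    let patch: u32 = parts.next()?.parse().ok()?;
    Some((major, minor, patch))
}

/// Asks the compiler for its version. Cargo sets RUSTC for build scripts;
/// outside of Cargo this falls back to whatever `rustc` is on the PATH.
fn query_rustc_version() -> Option<(u32, u32, u32)> {
    let rustc = env::var("RUSTC").unwrap_or_else(|_| "rustc".to_string());
    let out = Command::new(rustc).arg("--version").output().ok()?;
    parse_rustc_version(&String::from_utf8_lossy(&out.stdout))
}

fn main() {
    match query_rustc_version() {
        Some((major, minor, patch)) => {
            println!("detected rustc {}.{}.{}", major, minor, patch);
            // Only test for a minimum version, as the disclaimer advises.
            if (major, minor) >= (1, 27) {
                // Hypothetical cfg flag that the crate could check with
                // #[cfg(has_stable_simd)].
                println!("cargo:rustc-cfg=has_stable_simd");
            }
        }
        None => println!("could not determine the rustc version"),
    }
}
```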

When writing code compiled by LLVM backend, does architecture matter?

My question is actually more general than the title:
At what point does the architecture matter when writing code that will eventually be compiled to LLVM intermediary code, and then from there to the machine language?
Let's say I'm writing Rust (which uses LLVM as a backend). Am I automatically capable of compiling my Rust code to every architecture that LLVM can target (assuming there's an OS on that machine that can run it)?
Or could it be that the Rust standard library hasn't been made "ARM compatible" yet, so I couldn't compile to ARM even if the LLVM targets it?
What if I don't use any of the standard library, my entire program is just a program that returns right away? Could it be the case that even without any libraries, Rust (or what have you) can't compile to ARM (or what have you) even if the LLVM targets it?
If all the above examples compile just fine, what do I have to do to get my code to break on, or fail to compile for, a certain architecture?
Bonus question of the same variety:
Let's say the standard library makes use of OS system calls (which it surely does). Do you have to care about architecture when making system calls? Or does the OS (Linux, for example) abstract away architecture as well?
Thanks.
TL;DR
From my understanding you can compile to any target LLVM supports (there may still be a few caveats here with frontends using inline assembler or module level inline assembly), however, you are not guaranteed it will actually execute correctly. The frontend is responsible for doing the work to be portable across the platforms the author supports.
Note also that as a frontend developer you are responsible for providing the data layout and target triple.
see also:
llvm-bitcode-cross-platform
llvm FAQ
Implementing Portable sizeof
Cross Compile with Clang
Your Questions:
Let's say I'm writing Rust (which uses LLVM as a backend). Am I
automatically capable of compiling my Rust code to every architecture
that LLVM can target (assuming there's an OS on that machine that can
run it)?
This is dependent on the authors of the Rust frontend.
Or could it be that the Rust standard library hasn't been made "ARM
compatible" yet, so I couldn't compile to ARM even if the LLVM targets
it?
I'm pretty sure LLVM would be able to emit the instructions, but it may not be correct in terms of addressing.
I have not used the inline assembler facilities mentioned above myself, but I assume if it allows platform specific assembly then this would break platform agnostic compilation as well.
What if I don't use any of the standard library, my entire program is
just a program that returns right away? Could it be the case that even
without any libraries, Rust (or what have you) can't compile to ARM
(or what have you) even if the LLVM targets it?
This again depends on what the Rust frontend emits. There may be some boilerplate setup logic it emits even before it emits instructions for your logic.
I'm writing my own language in LLVM that does this in the case of a special function called "main". I am targeting the C ABI so it will wrap this main with a proper C style main and invoke it with a stricter set of parameters.
If all the above examples compile just fine, what do I have to do to
get my code to break on, or fail to compile for, a certain
architecture?
Consider C/C++ with Clang as mentioned in the llvm FAQ. Clang is a frontend, probably the most popular, for LLVM and the users writing C/C++ are responsible for #include-ing the appropriate platform specific functionality.
Some languages may be designed more platform independent and the frontend could then handle the work for you.
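In Rust specifically, the rough counterpart of including platform-specific headers is conditional compilation. The sketch below, with made-up function names and an arbitrary choice of architectures, shows code that differs per target and a guard that deliberately fails the build on targets it does not support.

```rust
use std::mem::size_of;

// Exactly one of these definitions is compiled, depending on the target.
#[cfg(target_arch = "x86_64")]
fn arch_name() -> &'static str {
    "x86_64"
}

#[cfg(target_arch = "aarch64")]
fn arch_name() -> &'static str {
    "aarch64"
}

#[cfg(not(any(target_arch = "x86_64", target_arch = "aarch64")))]
fn arch_name() -> &'static str {
    "other"
}

// Deliberately refuses to build on 16-bit targets. On every other target
// this item is removed before compilation and has no effect.
#[cfg(target_pointer_width = "16")]
compile_error!("this example assumes pointers wider than 16 bits");

/// Pointer size is one of the details that changes between architectures.
fn pointer_bytes() -> usize {
    size_of::<usize>()
}

fn main() {
    println!("compiled for: {}", arch_name());
    println!("pointer size: {} bytes", pointer_bytes());
}
```

Code that calls a function defined only under one cfg, without providing the fallback definition, would likewise fail to compile for every other architecture.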
Let's say the standard library makes use of OS system calls (which it
surely does). Do you have to care about architecture when making
system calls? Or does the OS (Linux, for example) abstract away
architecture as well?
I'm assuming you are talking about the case where the frontend targets the C standard library, in which case LLVM has standard C library intrinsics which could be used by the frontend. This is not the only way, however: you can use the call instruction to invoke C functions directly if targeting the C ABI, as in the Kaleidoscope example.
In the end the standard library can be a portability issue and must be addressed by the frontend developers.

gcc atomic built-in functions: any known conflicts within multithreading environments?

So, I want to avoid future problems when using __sync_fetch_and_add in the context of a Boost-based multithreaded application.
Is there any chance that the low-level threading implementation used by Boost (pthreads here) would affect the functionality of the builtins?
The builtins are the intrinsics.
They don't make assumptions about the libraries that will be used in applications.
There is no way it can interfere.
(On a tangent: Some libraries, like Boost Asio, optionally can use C++11 atomics instead of boost::detail::atomic_count (doc))
