Why are ceilf32 and sqrtf32 unsafe? - rust

I'm pretty new to Rust and have been working on some mathematical problems. For one of these problems I needed ceilf32 and sqrtf32. I was surprised to find that these functions are unsafe; both are fairly simple mathematical functions and my understanding is that unsafe Rust is used only as necessary to work around either the conservatism of the compiler or to allow inherently unsafe OS operations. I can't see any reason either function would run into either issue, thus I can't understand what would stop them being implemented with memory safety.
Could someone please enlighten me?

The functions you're looking at are in core::intrinsics, which are low-level compiler instructions. I don't see any official documentation on why they're marked unsafe, but my guess is that all of the compiler intrinsics were marked that way as a rule, since they're lower-level than most of Rust proper.
Regardless, for normal operation, you're looking for the inherent methods f32::ceil and f32::sqrt. These are the Rust standard library implementations that presumably[1] call the intrinsics as a course of action, and these methods are not marked unsafe.
Since they're inherent methods, you can either call them on f32 objects (my_number.sqrt()) or directly with the namespace (f32::sqrt(my_number)).
[1] In fact, a look at the source code for the current implementations indicates that both of these simply delegate to their intrinsic counterpart, wrapping it in an unsafe block to guarantee safety.

Related

When should I use Pin<Arc<T>> in Rust?

I'm pretty new to Rust and have a couple different implementations of a method that includes a closure referencing self. To use the reference in the closure effectively, I've been using Arc<Self> (I am multithreading) and Pin<Arc<Self>>.
I would like to make this method as generally memory efficient as possible. I assume pinning the Arc in memory would help with this. However, (a) I've read that Arcs are pinned and (b) it seems like Pin<Arc<T>> may require additional allocations.
What is Pin<Arc<T>> good for?
Adding Pin around some pointer type does not change the behavior of the program. It only adds a restriction on what further code you can write (and even that, only if the T in Pin<Arc<T>> is not Unpin, which most types are).
Therefore, there is no "memory efficiency" to be gained by adding Pin.
The only use of Pin is to allow working with types that require they be pinned to use them, such as Futures.

Can I avoid using explicit lifetime specifiers and instead use reference counting (Rc)?

I am reading the Rust Book and everything was pretty simple to understand (thanks to the book's authors), until the section about lifetimes. I spent all day, reading a lot of articles on lifetimes and still I am very insecure about using them correctly.
What I do understand, though, is that the concept of explicit lifetime specifiers aims to solve the problem of dangling references. I also know that Rust has reference-counting smart pointers (Rc) which I believe is the same as shared_ptr in C++, which has the same purpose: to prevent dangling references.
Given that those lifetimes are so horrendous to me, and smart pointers are very familiar and comfortable for me (I used them in C++ a lot), can I avoid the lifetimes in favor of smart pointers? Or are lifetimes an inevitable thing that I'll have to understand and use in Rust code?
are lifetimes an inevitable thing that I'll have to understand and use in Rust code?
In order to read existing Rust code, you probably don't need to understand lifetimes. The borrow-checker understands them so if it compiles then they are correct and you can just review what the code does.
I am very insecure about using them correctly.
The most important thing to understand about lifetimes annotations is that they do nothing. Rather, they are a way to express to the compiler the relationship between references. For example, if an input and output to a function have the same lifetime, that means that the output contains a reference to the input (or part of it) and therefore is not allowed to live longer than the input. Using them "incorrectly" means that you are telling the compiler something about the lifetime of a reference which it can prove to be untrue - and it will give you an error, so there is nothing to be insecure about!
can I avoid the lifetimes in favor of smart pointers?
You could choose to avoid using references altogether and use Rc everywhere. You would be missing out on one of the big features of Rust: lifetimes and references form one of the most important zero-cost abstractions, which enable Rust to be fast and safe at the same time. There is code written in Rust that nobody would attempt to write in C/C++ because a human could never be absolutely certain that they haven't introduced a memory bug. Avoiding Rust references in favour of smart pointers will mostly result in slower code, because smart pointers have runtime overhead.
Many APIs use references. In order to use those APIs you will need to have at least some grasp of what is going on.
The best way to understand is just to write code and gain an intuition from what works and what doesn't. Rust's error messages are excellent and will help a lot with forming that intuition.

When should inline be used in Rust?

Rust has an "inline" attribute that can be used in one of those three flavors:
#[inline]
#[inline(always)]
#[inline(never)]
When should they be used?
In the Rust reference, we see an inline attributes section saying
The compiler automatically inlines functions based on internal heuristics. Incorrectly inlining functions can actually make the program slower, so it should be used with care.
In the Rust internals forum, huon was also conservative about specifying inline.
But we see considerable usage in the Rust source, including the standard library. A lot of inline attributes are added to one-line-functions, which should be easy for the compilers to spot and optimize through heuristics according to the reference. Are those in fact not needed?
One limitation of the current Rust compiler is that it if you're not using LTO (Link-Time Optimization), it will never inline a function not marked #[inline] across crates. Rust uses a separate compilation model similar to C++ because LLVM's LTO implementation doesn't scale well to large projects. Therefore, small functions exposed to other crates need to be marked by hand. This isn't a great situation, and it's likely to be fixed in the future by some combination of improvements to LTO and MIR inlining.
#[inline(never)] is sometimes useful for debugging (separating a piece of code which isn't working as expected). In theory, it can be used for benchmarking, but that's usually a bad idea: turning off inlining doesn't prevent other inter-procedural optimizations like constant propagation. In terms of normal code, it can reduce codesize if you have a frequently used helper function which is only used for error handling.
#[inline(always)] is generally bad idea; if a function is big enough that the compiler won't inline it by default, it's big enough that the overhead of the call doesn't matter (and excessive inlining increases instruction cache pressure). There are exceptions, but you need performance measurements to justify it. This example is the sort of situation where it's worth considering. #[inline(always)] can also be used to improve -O0 code quality, but that's not usually worth worrying about.

Why is immutability enforced in Rust unless otherwise specified with `mut`?

Why is immutability forced in Rust, unless you specify mut? Is this a design choice for safety, do you consider this how it should be naturally in other languages?
I should probably clarify, I'm still a newbie at Rust. So is this a design choice related to another feature in the language?
The Rust-Book actually addresses this topic.
There is no single reason that bindings are immutable by default, but we can think about it through one of Rust’s primary focuses: safety. If you forget to say mut, the compiler will catch it, and let you know that you have mutated something you may not have intended to mutate. If bindings were mutable by default, the compiler would not be able to tell you this. If you did intend mutation, then the solution is quite easy: add mut.
There are other good reasons to avoid mutable state when possible, but they’re out of the scope of this guide. In general, you can often avoid explicit mutation, and so it is preferable in Rust. That said, sometimes, mutation is what you need, so it’s not verboten.
Basically it is the C++-Mantra that everything that you don't want to modify should be const, just properly done by reversing the rules. Also see this Stackoverflow article about C++.

Multiple specialization, iterator patterns in Rust

Learning Rust (yay!) and I'm trying to understand the intended idiomatic programming required for certain iterator patterns, while scoring top performance. Note: not Rust's Iterator trait, just a method I've written accepting a closure and applying it to some data I'm pulling off of disk / out of memory.
I was delighted to see that Rust (+LLVM?) took an iterator I had written for sparse matrix entries, and a closure for doing sparse matrix vector multiplication, written as
iterator.map_edges({ |x, y| dst[y] += src[x] });
and inlined the closure's body in the generated code. It went quite fast. :D
If I create two of these iterators, or use the first a second time (not a correctness issue) each instance slows down quite a lot (about 2x in this case), presumably because the optimizer no longer chooses to do specialization because of the multiple call sites, and you end up doing a function call for each element.
I'm trying to understand if there are idiomatic patterns that keep the pleasant experience above (I like it, at least) without sacrificing the performance. My options seem to be (none satisfying this constraint):
Accept dodgy performance (2x slower is not fatal, but no prizes either).
Ask the user to supply a batch-oriented closure, so acting on an iterator over a small batch of data. This exposes a bit much of the internals of the iterator (the data are compressed nicely, and the user needs to know how to unwrap them, or the iterator needs to stage an unwrapped batch in memory).
Make map_edges generic in a type implementing a hypothetical EdgeMapClosure trait, and ask the user to implement such a type for each closure they want to inline. Not tested, but I would guess this exposes distinct methods to LLVM, each of which get nicely inlined. Downside is that the user has to write their own closure (packing relevant state up, etc).
Horrible hacks, like make distinct methods map_edges0, map_edges1, ... . Or add a generic parameter the programmer can use to make the methods distinct, but which is otherwise ignored.
Non-solutions include "just use for pair in iterator.iter() { /* */ }"; this is prep work for a data/task-parallel platform, and I would like to be able to capture/move these closures to work threads rather than capturing the main thread's execution. Maybe the pattern I should be using is to write the above, put it in a lambda/closure, and ship it around instead?
In a perfect world, it would be great to have a pattern which causes each occurrence of map_edges in the source file to result in different specialized methods in the binary, without forcing the entire project to be optimized at some scary level. I'm coming out of an unpleasant relationship with managed languages and JITs where generics would be the only way (I know of) to get this to happen, but Rust and LLVM seem magical enough that I thought there might be a good way. How do Rust's iterators handle this to inline their closure bodies? Or don't they (they should!)?
It seems that the problem is resolved by Rust's new approach to closures outlined at
http://smallcultfollowing.com/babysteps/blog/2014/11/26/purging-proc/
In short, Option 3 above (make functions generic with respect to a new closure type) is now transparently implemented when you make an implementation generic using the new closure traits. Rust produces the type behind the scenes for you.

Resources