What are the semantics for dereferencing raw pointers? - rust

For shared references and mutable references the semantics are clear: as
long as you have a shared reference to a value, nothing else may have
mutable access, and a mutable reference can't be shared.
So this code:
#[no_mangle]
pub extern fn run_ref(a: &i32, b: &mut i32) -> (i32, i32) {
let x = *a;
*b = 1;
let y = *a;
(x, y)
}
compiles (on x86_64) to:
run_ref:
movl (%rdi), %ecx
movl $1, (%rsi)
movq %rcx, %rax
shlq $32, %rax
orq %rcx, %rax
retq
Note that the memory a points to is only read once, because the
compiler knows the write to b must not have modified the memory at
a.
Raw pointers are more complicated. Raw pointer arithmetic and casts are
"safe", but dereferencing them is not.
We can convert raw pointers back to shared and mutable references, and
then use them; this will certainly imply the usual reference semantics,
and the compiler can optimize accordingly.
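A minimal sketch of that conversion, assuming the caller guarantees the two pointers are valid and do not overlap (the names run_via_refs and call_with_distinct_locations are made up for illustration):

```rust
/// Hypothetical variant of `run_ref` that takes raw pointers and
/// immediately turns them back into references.
///
/// # Safety
/// Assumed caller contract: both pointers are valid, aligned, and do NOT
/// overlap. Once the references exist, the usual reference rules apply.
pub unsafe fn run_via_refs(a: *const i32, b: *mut i32) -> (i32, i32) {
    let a: &i32 = unsafe { &*a };
    let b: &mut i32 = unsafe { &mut *b };
    let x = *a;
    *b = 1;
    let y = *a; // the compiler may reuse `x` here
    (x, y)
}

/// Calls the function with two distinct locals, which upholds the contract.
pub fn call_with_distinct_locations() -> (i32, i32) {
    let source = 7;
    let mut target = 0;
    unsafe { run_via_refs(&source, &mut target) }
}
```

Passing the same location for both arguments would break the contract, because a &i32 and a &mut i32 to the same memory would then be live at once.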
But what are the semantics if we use raw pointers directly?
#[no_mangle]
pub unsafe extern fn run_ptr_direct(a: *const i32, b: *mut f32) -> (i32, i32) {
let x = *a;
*b = 1.0;
let y = *a;
(x, y)
}
compiles to:
run_ptr_direct:
movl (%rdi), %ecx
movl $1065353216, (%rsi)
movl (%rdi), %eax
shlq $32, %rax
orq %rcx, %rax
retq
Although we write a value of different type, the second read still goes
to memory - it seems to be allowed to call this function with the same
(or overlapping) memory location for both arguments. In other words, a
const raw pointer does not forbid a coexisting mut raw pointer; and
it's probably fine to have two mut raw pointers (of possibly different
types) to the same (or overlapping) memory location too.
Note that a normal optimizing C/C++-compiler would eliminate the second
read (due to the "strict aliasing" rule: modifying/reading the same
memory location through pointers of different ("incompatible") types is
UB in most cases):
struct tuple { int x; int y; };
extern "C" tuple run_ptr(int const* a, float* b) {
int const x = *a;
*b = 1.0;
int const y = *a;
return tuple{x, y};
}
compiles to:
run_ptr:
movl (%rdi), %eax
movl $0x3f800000, (%rsi)
movq %rax, %rdx
salq $32, %rdx
orq %rdx, %rax
ret
Playground with Rust code examples
godbolt Compiler Explorer with C example
So: What are the semantics if we use raw pointers directly: is it ok for
referenced data to overlap?
This should have direct implications on whether the compiler is allowed
to reorder memory access through raw pointers.

No awkward strict-aliasing here
C++ strict-aliasing is a patch on a wooden leg. C++ does not have any aliasing information, and the absence of aliasing information prevents a number of optimizations (as you noted here), therefore to regain some performance strict-aliasing was patched on...
Unfortunately, strict-aliasing is awkward in a systems language, because reinterpreting raw memory is the essence of what systems languages are designed to do.
And doubly unfortunately it does not enable that many optimizations. For example, copying from one array to another must assume that the arrays may overlap.
restrict (from C) is a bit more helpful, although it only applies to one level at a time.
Instead, we have scope-based aliasing analysis
The essence of the aliasing analysis in Rust is based on lexical scopes (barring threads).
The beginner level explanation that you probably know is:
if you have a &T, then there is no &mut T to the same instance,
if you have a &mut T, then there is no &T or &mut T to the same instance.
As suited to a beginner, it is a slightly abbreviated version. For example:
fn main() {
let mut i = 32;
let mut_ref = &mut i;
let x: &i32 = mut_ref;
println!("{}", x);
}
is perfectly fine, even though both a &mut i32 (mut_ref) and a &i32 (x) point to the same instance!
If you try to access mut_ref after forming x, however, the truth is unveiled:
fn main() {
let mut i = 32;
let mut_ref = &mut i;
let x: &i32 = mut_ref;
*mut_ref = 2;
println!("{}", x);
}
error[E0506]: cannot assign to `*mut_ref` because it is borrowed
|
4 | let x: &i32 = mut_ref;
| ------- borrow of `*mut_ref` occurs here
5 | *mut_ref = 2;
| ^^^^^^^^^^^^ assignment to borrowed `*mut_ref` occurs here
So, it is fine to have both &mut T and &T pointing to the same memory location at the same time; however mutating through the &mut T will be disabled for as long as the &T exists.
In a sense, the &mut T is temporarily downgraded to a &T.
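To illustrate the "temporary" part, here is a small sketch (function name is my own) where the shared reborrow is no longer used by the time the write happens. It assumes a compiler with non-lexical lifetimes, i.e. any reasonably recent Rust:

```rust
/// Reads through a shared reborrow first, then writes through the
/// original mutable reference once the reborrow is no longer used.
pub fn reborrow_then_mutate() -> i32 {
    let mut i = 32;
    let mut_ref = &mut i;
    let x: &i32 = mut_ref; // shared reborrow of `*mut_ref`
    let seen = *x;         // last use of `x`
    *mut_ref = 2;          // accepted: `x` is not used after this point
    seen + *mut_ref        // 32 + 2
}
```

Moving the read of x below the assignment brings back the E0506 error shown above.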
So, what of pointers?
First of all, let's review the reference. Raw pointers:
are not guaranteed to point to valid memory and are not even guaranteed to be non-NULL (unlike both Box and &);
do not have any automatic clean-up, unlike Box, and so require manual resource management;
are plain-old-data, that is, they don't move ownership, again unlike Box, hence the Rust compiler cannot protect against bugs like use-after-free;
lack any form of lifetimes, unlike &, and so the compiler cannot reason about dangling pointers; and
have no guarantees about aliasing or mutability other than mutation not being allowed directly through a *const T.
Conspicuously absent is any rule forbidding casting a *const T to a *mut T. That's normal, it's allowed, and therefore the last point is really more of a lint, since it can be so easily worked around.
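A sketch of that cast (the function name is hypothetical). Note the assumption baked in: the pointer originally comes from a &mut, so writing through it is permitted. As far as I know, doing the same with a pointer derived from a plain & would compile just as well but be undefined behavior:

```rust
/// Casts `*mut i32` -> `*const i32` -> `*mut i32` and writes through the result.
pub fn write_through_cast() -> i32 {
    let mut n = 1;
    // Start from a mutable borrow, so the pointer is allowed to write.
    let p: *const i32 = &mut n as *mut i32 as *const i32;
    // The cast back to `*mut` is an ordinary safe cast.
    let q: *mut i32 = p as *mut i32;
    // Only the dereference needs `unsafe`.
    unsafe { *q = 7 };
    n
}
```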
Nomicon
A discussion of unsafe Rust would not be complete without pointing to the Nomicon.
Essentially, the rules of unsafe Rust are rather simple: uphold whatever guarantee the compiler would have if it was safe Rust.
This is not as helpful as it could be, since those rules are not set in stone yet; sorry.
Then, what are the semantics for dereferencing raw pointers?
As far as I know1:
if you form a reference from the raw pointer (&T or &mut T) then you must ensure that the aliasing rules these references obey are upheld,
if you immediately read/write, this temporarily forms a reference.
That is, providing that the caller had mutable access to the location:
pub unsafe fn run_ptr_direct(a: *const i32, b: *mut f32) -> (i32, i32) {
let x = *a;
*b = 1.0;
let y = *a;
(x, y)
}
should be valid, because *a has type i32, so there is no overlap of lifetime in references.
However, I would expect:
pub unsafe fn run_ptr_modified(a: *const i32, b: *mut f32) -> (i32, i32) {
let x = &*a;
*b = 1.0;
let y = *a;
(*x, y)
}
To be undefined behavior, because x would be live while *b is used to modify its memory.
Note how subtle the change is. It's easy to break invariants in unsafe code.
1 And I might be wrong right now, or I may become wrong in the future
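For what it's worth, here is a sketch of calling run_ptr_direct with both pointers aimed at the same location. The caller function is my own addition, and the claim that this is well-defined rests on the same hedged understanding as above. The second read sees the bit pattern of 1.0f32, which matches the constant in the question's assembly:

```rust
/// Same body as in the question; the inner `unsafe` block is only there
/// so it also reads cleanly on newer editions.
pub unsafe fn run_ptr_direct(a: *const i32, b: *mut f32) -> (i32, i32) {
    unsafe {
        let x = *a;
        *b = 1.0;
        let y = *a;
        (x, y)
    }
}

/// Hypothetical caller: both pointers are derived from one `&mut i32`,
/// so the caller does have mutable access to the location.
pub fn call_with_same_location() -> (i32, i32) {
    let mut slot: i32 = 5;
    let p: *mut i32 = &mut slot;
    unsafe { run_ptr_direct(p as *const i32, p as *mut f32) }
}
```

The second element of the result is 1065353216 (0x3f800000), so the compiler really must reload from memory here.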

Related

Does moving ownership copy the `self` struct when calling a function?

In my example below does cons.push(...) ever copy the self parameter?
Or is rustc intelligent enough to realize that the values coming from lines #a and #b can always use the same stack space and no copying needs to occur (except for the obvious i32 copies)?
In other words, does a call to Cons.push(self, ...) always create a copy of self as ownership is being moved? Or does the self struct always stay in place on the stack?
References to documentation would be appreciated.
#[derive(Debug)]
struct Cons<T, U>(T, U);
impl<T, U> Cons<T, U> {
fn push<V>(self, value: V) -> Cons<Self, V> {
Cons(self, value)
}
}
fn main() {
let cons = Cons(1, 2); // #a
let cons = cons.push(3); // #b
println!("{:?}", cons); // #c
}
The implication in my example above is whether or not the push(...) function grows more expensive to call each time we add a line like #b at the rate of O(n^2) (if self is copied each time) or at the rate of O(n) (if self stays in place).
I tried implementing the Drop trait and noticed that both #a and #b were dropped after #c. To me this seems to indicate that self stays in place in this example, but I'm not 100% sure.
In general, trust in the compiler! Rust + LLVM is a very powerful combination that often produces surprisingly efficient code. And it will improve even more in time.
In other words, does a call to Cons.push(self, ...) always create a copy of self as ownership is being moved? Or does the self struct always stay in place on the stack?
self cannot stay in place because the new value returned by the push method has type Cons<Self, V>, which is essentially a tuple of Self and V. Although tuples don't have any memory layout guarantees, I strongly believe they can't have their elements scattered arbitrarily in memory. Thus, self and value must both be moved into the new structure.
The above paragraph assumed that self was placed firmly on the stack before calling push. The compiler actually has enough information to know it should reserve enough space for the final structure. Especially with function inlining this becomes a very likely optimization.
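To see why the old value has to end up inside the new one, it can help to look at the sizes. A small sketch (nested_sizes is a made-up helper; the exact numbers are what current compilers produce for i32 fields, not a layout guarantee):

```rust
use std::mem::size_of;

#[allow(dead_code)]
pub struct Cons<T, U>(T, U);

/// Sizes of a 2-, 3- and 4-element nested `Cons` of `i32`s.
pub fn nested_sizes() -> [usize; 3] {
    [
        size_of::<Cons<i32, i32>>(),
        size_of::<Cons<Cons<i32, i32>, i32>>(),
        size_of::<Cons<Cons<Cons<i32, i32>, i32>, i32>>(),
    ]
}
```

Each push produces a value that is one i32 larger and contains the previous value inline, so a non-inlined push has to move all previously stored elements.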
The implication in my example above is whether or not the push(...) function grows more expensive to call each time we add a line like #b at the rate of O(n^2) (if self is copied each time) or at the rate of O(n) (if self stays in place).
Consider two functions (playground):
pub fn push_int(cons: Cons<i32, i32>, x: i32) -> Cons<Cons<i32, i32>, i32> {
cons.push(x)
}
pub fn push_int_again(
cons: Cons<Cons<i32, i32>, i32>,
x: i32,
) -> Cons<Cons<Cons<i32, i32>, i32>, i32> {
cons.push(x)
}
push_int adds a third element to a Cons and push_int_again adds a fourth element.
push_int compiles to the following assembly in Release mode:
movq %rdi, %rax
movl %esi, (%rdi)
movl %edx, 4(%rdi)
movl %ecx, 8(%rdi)
retq
And push_int_again compiles to:
movq %rdi, %rax
movl 8(%rsi), %ecx
movl %ecx, 8(%rdi)
movq (%rsi), %rcx
movq %rcx, (%rdi)
movl %edx, 12(%rdi)
retq
You don't need to understand assembly to see that pushing the fourth element requires more instructions than pushing the third element.
Note that this observation was made for these functions in isolation. Calls like cons.push(x).push(y).push(...) are inlined and the assembly grows linearly with one instruction per push.
The ownership of cons from #a (type Cons<i32, i32>) is transferred into push(). Ownership is then transferred again to the new value of type Cons<Cons<i32, i32>, i32>, which is bound to the shadowing variable cons in #b.
If the struct Cons implemented the Copy trait, the value would be copied and the original would stay usable. Otherwise it is moved, and you cannot use the original variable after its value has been moved to (is owned by) someone else.
Move semantics:
let cons = Cons(1, 2); //Cons(1,2) as resource in memory being pointed by cons
let cons2 = cons; // Cons(1,2) now pointed by cons2. Problem! as cons also point it. Lets prevent access from cons
println!("{:?}", cons); //error because cons is moved
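For contrast, a minimal sketch of the Copy case, using a hypothetical Pair type of my own rather than the Cons from the question:

```rust
/// A small type that opts into `Copy`, so assignment duplicates it
/// instead of moving it.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Pair(pub i32, pub i32);

pub fn copy_keeps_original() -> (Pair, Pair) {
    let a = Pair(1, 2);
    let b = a; // copies; `a` remains usable
    (a, b)
}
```

Without the Copy derive, the last line would fail to compile with a use-after-move error, just like the println! above.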

How to safely get an immutable byte slice from a `&mut [u32]`?

In a rather low level part of a project of mine, a function receives a mutable slice of primitive data (&mut [u32] in this case). This data should be written to a writer in little endian.
Now, this alone wouldn't be a problem, but all of this has to be fast. I measured my application and identified this as one of the critical paths. In particular, if the endianness doesn't need to be changed (since we're already on a little endian system), there shouldn't be any overhead.
This is my code (Playground):
use std::{io, mem, slice};
fn write_data(mut w: impl io::Write, data: &mut [u32]) -> Result<(), io::Error> {
adjust_endianness(data);
// Is this safe?
let bytes = unsafe {
let len = data.len() * mem::size_of::<u32>();
let ptr = data.as_ptr() as *const u8;
slice::from_raw_parts(ptr, len)
};
w.write_all(bytes)
}
fn adjust_endianness(_: &mut [u32]) {
// implementation omitted
}
adjust_endianness changes the endianness in place (which is fine, since a wrong-endian u32 is garbage, but still a valid u32).
This code works, but the critical question is: Is this safe? In particular, at some point, data and bytes both exist, being one mutable and one immutable slice to the same data. That sounds very bad, right?
On the other hand, I can do this:
let bytes = &data[..];
That way, I also have those two slices. The difference is just that data is now borrowed.
Is my code safe or does it exhibit UB? Why? If it's not safe, how to safely do what I want to do?
In general, creation of slices that violate Rust's safety rules, even briefly, is unsafe. If you cheat the borrow checker and make independent slices borrowing the same data as & and &mut at the same time, it will make Rust specify incorrect aliasing information in LLVM, and this may lead to actually miscompiled code. Miri doesn't flag this case, because you're not using data afterwards, but the exact details of what is unsafe are still being worked out.
To be safe, you should explain the sharing situation to the borrow checker:
let shared_data = &data[..];
data will be temporarily reborrowed as shared/read-only for the duration shared_data is used. In this case it shouldn't cause any limitations. The data will keep being mutable after exiting this scope.
Then you'll have &[u32], but you need &[u8]. Fortunately, this conversion is safe to do, because both are shared, and u8 has a smaller alignment requirement than u32 (if it were the other way around, you'd have to use align_to).
let shared_data = &data[..];
let bytes = unsafe {
let len = shared_data.len() * mem::size_of::<u32>();
let ptr = shared_data.as_ptr() as *const u8;
slice::from_raw_parts(ptr, len)
};
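Putting the pieces together, the whole function might look like the sketch below. The body of adjust_endianness is my own guess (the question omitted it), and mem::size_of_val is used only as a shorter way to get the byte length:

```rust
use std::{io, mem, slice};

/// Assumed implementation: convert each value to little endian in place.
/// This is a no-op on little-endian targets.
fn adjust_endianness(data: &mut [u32]) {
    for value in data.iter_mut() {
        *value = value.to_le();
    }
}

pub fn write_data(mut w: impl io::Write, data: &mut [u32]) -> io::Result<()> {
    adjust_endianness(data);

    // Reborrow as shared so the borrow checker knows about the sharing.
    let shared_data: &[u32] = &data[..];
    let bytes: &[u8] = unsafe {
        // u8 has alignment 1 and every byte of a u32 is initialized,
        // so viewing the same memory as bytes is fine.
        slice::from_raw_parts(
            shared_data.as_ptr() as *const u8,
            mem::size_of_val(shared_data),
        )
    };
    w.write_all(bytes)
}
```

Called with a &mut Vec<u8> as the writer and the slice [1, 0x01020304], this writes the bytes 1, 0, 0, 0, 4, 3, 2, 1.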

Why is casting a const reference directly to a mutable reference invalid in Rust?

This piece of code is correct:
fn f() {
let mut x = 11;
b(&x as *const u8 as *mut u8);
}
fn b(x: *mut u8) {}
Why is b(&x as *const u8 as *mut u8) valid whereas b(&x as *mut u8) is invalid? The compiler complains about:
error[E0606]: casting &u8 as *mut u8 is invalid
The superficial answer to the question "why?" is that these simply are the rules of as expressions in Rust. Quoting from the Nomicon:
Casting is not transitive, that is, even if e as U1 as U2 is a valid
expression, e as U2 is not necessarily so.
With the as operator, you can either perform explicit coercions or casts.
There is neither a cast nor a coercion to go directly from &u8 to *mut u8. However, there is a pointer weakening coercion from &T to *const T and a cast from a pointer to a sized type to any other. The combination of the two results in the expression in your question.
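Spelled out step by step, the two conversions look like this (cast_chain is just an illustrative name):

```rust
/// Shows the coercion and the cast as two separate steps.
pub fn cast_chain() -> bool {
    let x: u8 = 11;
    // Step 1: pointer weakening, &u8 -> *const u8.
    let p_const = &x as *const u8;
    // Step 2: pointer-to-pointer cast, *const u8 -> *mut u8.
    let p_mut = p_const as *mut u8;
    // let bad = &x as *mut u8; // error[E0606]: no direct cast exists
    // Both pointers carry the same address.
    p_mut as *const u8 == p_const
}
```

The resulting *mut u8 still comes from a shared reference, so it is only good for reading or for passing along; as far as I know, writing through it would be undefined behavior.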
The deeper question is why the language was designed this way. I don't actually know, since I wasn't in the room when these decisions were made, and I couldn't find a rationale on the web. Rust in general tries to be very explicit with type casts, to avoid conversions that weren't actually intended and to keep the rules simple. These principles seem to have influenced this particular design decision as well.
Because a Rust shared reference (&T) is constant.
What you want to do is to cast a mutable reference &mut x to a raw integer mutable pointer *mut i32.
This code is valid:
let mut x = 42;
let ptr = &mut x as *mut _;
// equivalent to
let ptr_2 = &x as *const _ as *mut _;

Is casting between integers expensive?

I am working on a project where I am doing a lot of index-based calculation. I have a few lines like:
let mut current_x: usize = (start.x as isize + i as isize * delta_x) as usize;
start.x and i are usizes and delta_x is of type isize. Most of my data is unsigned, therefore storing it signed would not make much sense. On the other hand, when I manipulate an array I am accessing a lot I have to convert everything back to usize as seen above.
Is casting between integers expensive? Does it have an impact on runtime performance at all?
Are there other ways to handle index arithmetics easier / more efficiently?
It depends
It's basically impossible to answer your question in isolation. These types of low-level things can be aggressively combined with operations that have to happen anyway, so any amount of inlining can change the behavior. Additionally, it strongly depends on your processor; changing to a 64-bit number on an 8-bit microcontroller is probably pretty expensive!
My general advice is to not worry. Keep your types consistent, get the right answers, then profile your code and fix the issues you find.
Pragmatically, what are you going to do instead?
That said, here's some concrete stuff for x86-64 and Rust 1.18.0.
Same size, changing sign
Basically no impact. If these were inlined, then you probably would never even see any assembly.
#[inline(never)]
pub fn signed_to_unsigned(i: isize) -> usize {
i as usize
}
#[inline(never)]
pub fn unsigned_to_signed(i: usize) -> isize {
i as isize
}
Each generates the assembly
movq %rdi, %rax
retq
Extending a value
These have to sign- or zero-extend the value, so some kind of minimal operation has to occur to fill those extra bits:
#[inline(never)]
pub fn u8_to_u64(i: u8) -> u64 {
i as u64
}
#[inline(never)]
pub fn i8_to_i64(i: i8) -> i64 {
i as i64
}
Generates the assembly
movzbl %dil, %eax
retq
movsbq %dil, %rax
retq
Truncating a value
Truncating is again just another move, basically no impact.
#[inline(never)]
pub fn u64_to_u8(i: u64) -> u8 {
i as u8
}
#[inline(never)]
pub fn i64_to_i8(i: i64) -> i8 {
i as i8
}
Generates the assembly
movl %edi, %eax
retq
movl %edi, %eax
retq
All these operations boil down to a single instruction on x86-64. Then you get into complications around "how long does an operation take" and that's even harder.
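If it helps to see what those instructions do to actual values, here is a small sketch with numbers of my own choosing:

```rust
/// One example per category: sign change, zero-extension,
/// sign-extension, and two truncations.
pub fn cast_examples() -> (usize, u64, i64, u8, i8) {
    (
        (-1isize) as usize, // same bits, reinterpreted: usize::MAX
        200u8 as u64,       // zero-extended: 200
        (-56i8) as i64,     // sign-extended: -56
        300u64 as u8,       // truncated to the low byte: 44
        0x180i64 as i8,     // low byte is 0x80: -128
    )
}
```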

Lifetime differences between references to zero sized types

I came across an interesting case while playing with zero sized types (ZSTs). A reference to an empty array will mold to a reference with any lifetime:
fn mold_slice<'a, T>(_: &'a T) -> &'a [T] {
&[]
}
I thought about how that is possible, since basically the "value" here lives on the stack frame of the function, yet the signature promises to return a reference to a value with a longer lifetime ('a contains the function call). I came to the conclusion that it is because the empty array [] is a ZST which basically only exists statically. The compiler can "fake" the value the reference refers to.
So I tried this:
fn mold_unit<'a, T>(_: &'a T) -> &'a () {
&()
}
and then the compiler complained:
error: borrowed value does not live long enough
--> <anon>:7:6
|
7 | &()
| ^^ temporary value created here
8 | }
| - temporary value only lives until here
|
note: borrowed value must be valid for the lifetime 'a as defined on the block at 6:40...
--> <anon>:6:41
|
6 | fn mold_unit<'a, T>(_: &'a T) -> &'a () {
| ^
It doesn't work for the unit () type, and it also does not work for an empty struct:
struct Empty;
// fails to compile as well
fn mold_struct<'a, T>(_: &'a T) -> &'a Empty {
&Empty
}
Somehow, the unit type and the empty struct are treated differently from the empty array. Are there any additional differences between those values besides just being ZSTs? Do the differences (&[] fitting any lifetime and &(), &Empty not) have nothing to do with ZSTs at all?
Playground example
It's not that [] is zero-sized (though it is), it's that [] is a constant, compile-time literal. This means the compiler can store it in the executable, rather than having to allocate it dynamically on the heap or stack. This, in turn, means that pointers to it last as long as they want, because data in the executable isn't going anywhere.
Annoyingly, this doesn't extend to something like &[0], because Rust isn't quite smart enough to realise that [0] is definitely constant. You can work around this by using something like:
fn mold_slice<'a, T>(_: &'a T) -> &'a [i32] {
const C: &'static [i32] = &[0];
C
}
This trick also works with anything you can put in a const, like () or Empty.
Realistically, however, it'd be simpler to just have functions like this return a &'static borrow, since that can be coerced to any other lifetime automatically.
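Applied to the two failing functions from the question, the const trick might look like this (the constant names are my own):

```rust
pub struct Empty;

pub fn mold_unit<'a, T>(_: &'a T) -> &'a () {
    // The const holds a 'static reference, which coerces to any shorter 'a.
    const UNIT: &'static () = &();
    UNIT
}

pub fn mold_struct<'a, T>(_: &'a T) -> &'a Empty {
    const EMPTY: &'static Empty = &Empty;
    EMPTY
}
```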
Edit: the previous version noted that &[] is not zero sized, which was a little tangential.
Do the differences (&[] fitting any lifetime and &(), &Empty not) have nothing to do with ZSTs at all?
I think this is exactly the case. The compiler probably just treats arrays differently and there is no deeper reasoning behind it.
The only difference that could play a role is that &[] is a fat pointer, consisting of the data pointer and a length. This fat pointer itself expresses the fact that there is actually no data behind it (because length=0). &() on the other hand is just a normal pointer. Here, only the type system expresses the fact that it's not pointing to anything real. But I'm just guessing here.
To clarify: a reference fitting any lifetime means that the reference has the 'static lifetime. So instead of introducing some lifetime 'a, we can just return a static reference and will have the same effect (&[] works, the others don't).
There is an RFC which specifies that references to constexpr rvalues will be stored in the static data section of the executable, instead of the stack. After this RFC has been implemented (tracking issue), all of your examples will compile, as [], () and Empty are constexpr rvalues. References to them will always be 'static. But the important part of the RFC is that it works for non-ZSTs, too: e.g. &27 has the type &'static i32.
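A sketch of what the RFC allows, assuming a compiler that implements it (to my knowledge this has been in stable Rust for a long time now; the function names are made up):

```rust
pub struct Empty;

/// Works for a non-ZST constant expression as well.
pub fn promoted_int() -> &'static i32 {
    &27
}

pub fn promoted_unit() -> &'static () {
    &()
}

pub fn promoted_empty() -> &'static Empty {
    &Empty
}
```

No const item or transmute is needed; the borrowed constant expression is placed in static memory by the compiler.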
To have some fun, let's look at the generated assembly (I used the amazing Compiler Explorer)! First let's try the working version:
pub fn mold_slice() -> &'static [i32] {
&[]
}
Using the -O flag (meaning: optimizations enabled; I checked the unoptimized version, too, and it doesn't have significant differences), this is compiled down to:
mold_slice:
push rbp
mov rbp, rsp
lea rax, [rip + ref.0]
xor edx, edx
pop rbp
ret
ref.0:
The fat pointer is returned in the rax (data pointer) and rdx (length) registers. As you can see, the length is set to 0 (xor edx, edx) and the data pointer is set to this mysterious ref.0. The ref.0 is not actually referencing anything at all. It's just an empty marker. This means we return just some pointer to the data section.
Now let's just tell the compiler to trust us on &() in order to compile it:
pub fn possibly_broken() -> &'static () {
unsafe { std::mem::transmute(&()) }
}
Result:
possibly_broken:
push rbp
mov rbp, rsp
lea rax, [rip + ref.1]
pop rbp
ret
ref.1:
Wow, we pretty much see the same result! The pointer (returned via rax) points somewhere to the data section. So it actually is a 'static reference after code generation. Only the lifetime checker doesn't quite know that and still refuses to compile the code. Well... I guess this is nothing dramatic, especially since the RFC mentioned above will fix that in the near future.
