In my example below does cons.push(...) ever copy the self parameter?
Or is rustc intelligent enough to realize that the values coming from lines #a and #b can always use the same stack space and no copying needs to occur (except for the obvious i32 copies)?
In other words, does a call to Cons.push(self, ...) always create a copy of self as ownership is being moved? Or does the self struct always stay in place on the stack?
References to documentation would be appreciated.
#[derive(Debug)]
struct Cons<T, U>(T, U);

impl<T, U> Cons<T, U> {
    fn push<V>(self, value: V) -> Cons<Self, V> {
        Cons(self, value)
    }
}

fn main() {
    let cons = Cons(1, 2); // #a
    let cons = cons.push(3); // #b
    println!("{:?}", cons); // #c
}
The implication in my example above is whether or not the push(...) function grows more expensive to call each time we add a line like #b at the rate of O(n^2) (if self is copied each time) or at the rate of O(n) (if self stays in place).
I tried implementing the Drop trait and noticed that both #a and #b were dropped after #c. To me this seems to indicate that self stays in place in this example, but I'm not 100% sure.
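The drop-order experiment described above can be reproduced with a sketch like the following (the thread-local event log is my addition, used to make the ordering observable instead of just printed):

```rust
use std::cell::RefCell;

thread_local! {
    static EVENTS: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
}

fn record(e: &'static str) {
    EVENTS.with(|v| v.borrow_mut().push(e));
}

#[derive(Debug)]
struct Cons<T, U>(T, U);

impl<T, U> Cons<T, U> {
    fn push<V>(self, value: V) -> Cons<Self, V> {
        // Moving the whole self into the new struct is allowed even
        // though Cons implements Drop (only partial moves are forbidden).
        Cons(self, value)
    }
}

impl<T, U> Drop for Cons<T, U> {
    fn drop(&mut self) {
        record("drop");
    }
}

fn main() {
    {
        let cons = Cons(1, 2); // #a
        let cons = cons.push(3); // #b
        record("print");
        println!("{:?}", cons); // #c
    } // both Cons layers (outer, then inner) drop here, after #c
    EVENTS.with(|v| assert_eq!(*v.borrow(), vec!["print", "drop", "drop"]));
    println!("both drops happened after the print");
}
```

Note that shadowing at #b causes no drop: the #a value was moved into push and lives on inside the new structure, which is why both layers drop only at the end of the scope.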
In general, trust in the compiler! Rust + LLVM is a very powerful combination that often produces surprisingly efficient code. And it will improve even more in time.
In other words, does a call to Cons.push(self, ...) always create a copy of self as ownership is being moved? Or does the self struct always stay in place on the stack?
self cannot stay in place, because the new value returned by the push method has type Cons<Self, V>, which is essentially a tuple of Self and V. Although tuples don't have any memory layout guarantees, their elements can't be scattered arbitrarily in memory. Thus, self and value must both be moved into the new structure.
The paragraph above assumes that self was placed firmly on the stack before calling push. In practice the compiler has enough information to reserve enough space for the final structure up front; especially with function inlining this becomes a very likely optimization.
The implication in my example above is whether or not the push(...) function grows more expensive to call each time we add a line like #b at the rate of O(n^2) (if self is copied each time) or at the rate of O(n) (if self stays in place).
Consider two functions (playground):
pub fn push_int(cons: Cons<i32, i32>, x: i32) -> Cons<Cons<i32, i32>, i32> {
    cons.push(x)
}

pub fn push_int_again(
    cons: Cons<Cons<i32, i32>, i32>,
    x: i32,
) -> Cons<Cons<Cons<i32, i32>, i32>, i32> {
    cons.push(x)
}
push_int adds a third element to a Cons and push_int_again adds a fourth element.
push_int compiles to the following assembly in Release mode:
movq %rdi, %rax
movl %esi, (%rdi)
movl %edx, 4(%rdi)
movl %ecx, 8(%rdi)
retq
And push_int_again compiles to:
movq %rdi, %rax
movl 8(%rsi), %ecx
movl %ecx, 8(%rdi)
movq (%rsi), %rcx
movq %rcx, (%rdi)
movl %edx, 12(%rdi)
retq
You don't need to understand assembly to see that pushing the fourth element requires more instructions than pushing the third element.
Note that this observation was made for these functions in isolation. Calls like cons.push(x).push(y).push(...) are inlined and the assembly grows linearly with one instruction per push.
The ownership of the Cons value created at #a is transferred into push(). The result, of type Cons<Cons<i32, i32>, i32>, is in turn owned by the shadowed variable cons at #b.
If the struct Cons implemented the Copy and Clone traits, it would be copied. Otherwise there is no copy, and you cannot use the original variables after they have been moved (i.e. are owned by someone else).
Move semantics:
let cons = Cons(1, 2); // Cons(1, 2) is a resource in memory, owned by cons
let cons2 = cons; // Cons(1, 2) is now owned by cons2. Problem! cons also refers to it, so access through cons must be prevented
println!("{:?}", cons); // error: use of moved value `cons`
In Rust, there are two possibilities to take a reference
Borrow, i.e., take a reference but don't allow mutating the referenced value. The & operator borrows from a value.
Borrow mutably, i.e., take a reference that allows mutating the referenced value. The &mut operator mutably borrows from a value.
The Rust documentation about borrowing rules says:
First, any borrow must last for a scope no greater than that of the
owner. Second, you may have one or the other of these two kinds of
borrows, but not both at the same time:
one or more references (&T) to a resource,
exactly one mutable reference (&mut T).
I believe that taking a reference is creating a pointer to the value and accessing the value by the pointer. This could be optimized away by the compiler if there is a simpler equivalent implementation.
However, I don't understand what move means and how it is implemented.
For types implementing the Copy trait it means copying e.g. by assigning the struct member-wise from the source, or a memcpy(). For small structs or for primitives this copy is efficient.
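The Copy behaviour described above can be sketched directly; the contrast with a non-Copy type (String) makes the difference visible:

```rust
// Point is a small struct that opts into Copy: assignment duplicates
// the bits and the source stays valid.
#[derive(Copy, Clone, Debug)]
struct Point {
    x: i32,
    y: i32,
}

fn main() {
    let p = Point { x: 1, y: 2 };
    let q = p; // bitwise copy: p is still usable afterwards
    assert_eq!(p.x + q.y, 3);

    let s = String::from("hi");
    let t = s; // move: s is no longer usable
    // println!("{}", s); // error[E0382]: borrow of moved value: `s`
    assert_eq!(t, "hi");
    println!("copy kept the source valid; the move did not");
}
```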
And for move?
This question is not a duplicate of What are move semantics? because Rust and C++ are different languages and move semantics are different between the two.
Semantics
Rust implements what is known as an Affine Type System:
Affine types are a version of linear types imposing weaker constraints, corresponding to affine logic. An affine resource can only be used once, while a linear one must be used once.
Types that are not Copy, and are thus moved, are Affine Types: you may use them either once or never, nothing else.
Rust qualifies this as a transfer of ownership in its Ownership-centric view of the world (*).
(*) Some of the people working on Rust are much more qualified than I am in CS, and they knowingly implemented an Affine Type System; however contrary to Haskell which exposes the math-y/cs-y concepts, Rust tends to expose more pragmatic concepts.
Note: it could be argued that Affine Types returned from a function tagged with #[must_use] are actually Linear Types from my reading.
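The "use once or never" rule can be demonstrated with a minimal sketch (Token and consume are hypothetical names for illustration):

```rust
// Token is not Copy, so it is an affine value: it may be used at most once.
struct Token;

fn consume(_t: Token) {
    // Taking Token by value is the one allowed "use"; _t is dropped here.
}

fn main() {
    let t = Token;
    consume(t); // the single permitted use
    // consume(t); // error[E0382]: use of moved value: `t`
    println!("token consumed exactly once");
}
```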
Implementation
It depends. Please keep in mind than Rust is a language built for speed, and there are numerous optimizations passes at play here which will depend on the compiler used (rustc + LLVM, in our case).
Within a function body (playground):
fn main() {
    let s = "Hello, World!".to_string();
    let t = s;
    println!("{}", t);
}
If you check the LLVM IR (in Debug), you'll see:
%_5 = alloca %"alloc::string::String", align 8
%t = alloca %"alloc::string::String", align 8
%s = alloca %"alloc::string::String", align 8
%0 = bitcast %"alloc::string::String"* %s to i8*
%1 = bitcast %"alloc::string::String"* %_5 to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* %1, i8* %0, i64 24, i32 8, i1 false)
%2 = bitcast %"alloc::string::String"* %_5 to i8*
%3 = bitcast %"alloc::string::String"* %t to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* %3, i8* %2, i64 24, i32 8, i1 false)
Underneath the covers, rustc invokes a memcpy from the result of "Hello, World!".to_string() to s and then to t. While it might seem inefficient, checking the same IR in Release mode you will realize that LLVM has completely elided the copies (realizing that s was unused).
The same situation occurs when calling a function: in theory you "move" the object into the function stack frame, however in practice if the object is large the rustc compiler might switch to passing a pointer instead.
Another situation is returning from a function, but even then the compiler might apply "return value optimization" and build directly in the caller's stack frame -- that is, the caller passes a pointer into which to write the return value, which is used without intermediary storage.
The ownership/borrowing constraints of Rust enable optimizations that are difficult to reach in C++ (which also has RVO but cannot apply it in as many cases).
So, the digest version:
moving large objects is inefficient, but there are a number of optimizations at play that might elide the move altogether
moving involves a memcpy of std::mem::size_of::<T>() bytes, so moving a large String is cheap: it only copies its three-word header (pointer, length, capacity), whatever the size of the allocated buffer it holds onto
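This three-word claim can be checked directly; the megabyte string below is my own example:

```rust
fn main() {
    // A String is a three-word header (pointer, length, capacity);
    // moving it copies only those words, never the heap buffer.
    assert_eq!(
        std::mem::size_of::<String>(),
        3 * std::mem::size_of::<usize>()
    );
    let s = "x".repeat(1_000_000); // ~1 MB heap buffer
    let t = s; // the move copies 24 bytes on a 64-bit target
    assert_eq!(t.len(), 1_000_000);
    println!("moved a megabyte-backed String by copying its header");
}
```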
When you move an item, you are transferring ownership of that item. That's a key component of Rust.
Let's say I had a struct, and then I assign the struct from one variable to another. By default, this will be a move, and I've transferred ownership. The compiler will track this change of ownership and prevent me from using the old variable any more:
pub struct Foo {
    value: u8,
}

fn main() {
    let foo = Foo { value: 42 };
    let bar = foo;
    println!("{}", foo.value); // error: use of moved value: `foo.value`
    println!("{}", bar.value);
}
how it is implemented.
Conceptually, moving something doesn't need to do anything. In the example above, there wouldn't be a reason to actually allocate space somewhere and then move the allocated data when I assign to a different variable. I don't actually know what the compiler does, and it probably changes based on the level of optimization.
For practical purposes though, you can think that when you move something, the bits representing that item are duplicated as if via memcpy. This helps explain what happens when you pass a variable to a function that consumes it, or when you return a value from a function (again, the optimizer can do other things to make it efficient, this is just conceptually):
// Ownership is transferred from the caller to the callee
fn do_something_with_foo(foo: Foo) {}
// Ownership is transferred from the callee to the caller
fn make_a_foo() -> Foo { Foo { value: 42 } }
"But wait!", you say, "memcpy only comes into play with types implementing Copy!". This is mostly true, but the big difference is that when a type implements Copy, both the source and the destination are valid to use after the copy!
One way of thinking of move semantics is the same as copy semantics, but with the added restriction that the thing being moved from is no longer a valid item to use.
However, it's often easier to think of it the other way: The most basic thing that you can do is to move / give ownership away, and the ability to copy something is an additional privilege. That's the way that Rust models it.
This is a tough question for me! After using Rust for a while the move semantics are natural. Let me know what parts I've left out or explained poorly.
Rust's move keyword always bothered me, so I decided to write up the understanding I reached after discussing it with my colleagues.
I hope this might help someone.
let x = 1;
In the above statement, x is a variable whose value is 1. Now,
let y = || println!("y is a variable whose value is a closure");
So, the move keyword is used to transfer ownership of a variable to the closure.
In the example below, without move, x is not owned by the closure. Hence x is not owned by y and remains available for further use.
let x = 1;
let y = || println!("this is a closure that prints x = {}", x);
On the other hand, in the next case, x is owned by the closure. x is owned by y and not available for further use.
let x = 1;
let y = move || println!("this is a closure that prints x = {}", x);
By owning I mean containing as a member variable. The example cases above are in the same situation as the following two cases. We can also assume the below explanation as to how the Rust compiler expands the above cases.
The former (without move, i.e. no transfer of ownership):
struct ClosureObject<'a> {
    x: &'a u32,
}

let x = 1;
let y = ClosureObject { x: &x };
The latter (with move, i.e. transfer of ownership):
struct ClosureObject {
    x: u32,
}

let x = 1;
let y = ClosureObject { x: x };
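The two desugarings above can be exercised side by side. Note that for a Copy type like u32 a move closure merely copies the value, so the example below uses a String to make the ownership transfer visible:

```rust
fn main() {
    let x = 1;
    let y = || println!("borrowing closure prints x = {}", x);
    y();
    println!("x is still usable: {}", x); // x was only borrowed

    // With a non-Copy type, `move` visibly transfers ownership
    // into the closure.
    let s = String::from("owned");
    let z = move || println!("move closure owns s = {}", s);
    z();
    // println!("{}", s); // error[E0382]: borrow of moved value: `s`
    assert_eq!(x, 1);
}
```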
Please let me answer my own question. I had trouble, but by asking a question here I did Rubber Duck Problem Solving. Now I understand:
A move is a transfer of ownership of the value.
For example the assignment let x = a; transfers ownership: at first a owned the value; after the let it's x that owns it. Rust forbids using a thereafter.
In fact, if you do println!("a: {:?}", a); after the let, the Rust compiler says:
error: use of moved value: `a`
println!("a: {:?}", a);
^
Complete example:
#[derive(Debug)]
struct Example { member: i32 }

fn main() {
    let a = Example { member: 42 };
    let x = a; // the struct is moved
    println!("a: {:?}", a);
    println!("x: {:?}", x);
}
And what does this move mean?
It seems that the concept comes from C++11. A document about C++ move semantics says:
From a client code point of view, choosing move instead of copy means that you don't care what happens to the state of the source.
Aha. C++11 does not care what happens with source. So in this vein, Rust is free to decide to forbid to use the source after a move.
And how it is implemented?
I don't know. But I can imagine that Rust does literally nothing. x is just a different name for the same value. Names usually are compiled away (except of course debugging symbols). So it's the same machine code whether the binding has the name a or x.
It seems C++ does the same in copy constructor elision.
Doing nothing is the most efficient possible.
Passing a value to function, also results in transfer of ownership; it is very similar to other examples:
struct Example { member: i32 }

fn take(ex: Example) {
    // 2) Now ex owns the data that a owned in main
    println!("a.member: {}", ex.member);
    // 3) When ex goes out of scope, so does its access to the data.
    //    Rust then frees that memory.
}

fn main() {
    let a = Example { member: 42 };
    take(a); // 1) Ownership is transferred to the function take
    // 4) We can no longer use a to access the data it owned
    println!("a.member: {}", a.member);
}
Hence the expected error:
post_test_7.rs:12:30: 12:38 error: use of moved value: `a.member`
let s1: String = String::from("hello");
let s2: String = s1;
To ensure memory safety, Rust invalidates s1; so instead of being a shallow copy, this is called a move.
fn main() {
    // Each value in Rust has a variable that is called its owner.
    // There can only be one owner at a time.
    let s = String::from("hello");
    take_ownership(s);
    println!("{}", s);
    // Error: borrow of moved value `s` — value borrowed here after move,
    // so s cannot be used after the move.
    // Passing a parameter into a function is the same as assigning s to
    // another variable: passing s moves it into the my_string parameter,
    // then println!("{}", my_string) runs. When take_ownership's scope
    // ends, my_string is dropped.

    let x: i32 = 2;
    makes_copy(x);
    // Instead of being moved, integers are copied: we can still use x
    // after the call. Primitive types are Copy and live on the stack
    // because their size is known at compile time.
    println!("{}", x);
}

fn take_ownership(my_string: String) {
    println!("{}", my_string);
}

fn makes_copy(some_integer: i32) {
    println!("{}", some_integer);
}
For shared references and mutable references the semantics are clear: as
long as you have a shared reference to a value, nothing else must have
mutable access, and a mutable reference can't be shared.
So this code:
#[no_mangle]
pub extern fn run_ref(a: &i32, b: &mut i32) -> (i32, i32) {
    let x = *a;
    *b = 1;
    let y = *a;
    (x, y)
}
compiles (on x86_64) to:
run_ref:
movl (%rdi), %ecx
movl $1, (%rsi)
movq %rcx, %rax
shlq $32, %rax
orq %rcx, %rax
retq
Note that the memory a points to is only read once, because the
compiler knows the write to b must not have modified the memory at
a.
Raw pointers are more complicated. Raw pointer arithmetic and casts are "safe", but dereferencing them is not.
We can convert raw pointers back to shared and mutable references, and
then use them; this will certainly imply the usual reference semantics,
and the compiler can optimize accordingly.
But what are the semantics if we use raw pointers directly?
#[no_mangle]
pub unsafe extern fn run_ptr_direct(a: *const i32, b: *mut f32) -> (i32, i32) {
    let x = *a;
    *b = 1.0;
    let y = *a;
    (x, y)
}
compiles to:
run_ptr_direct:
movl (%rdi), %ecx
movl $1065353216, (%rsi)
movl (%rdi), %eax
shlq $32, %rax
orq %rcx, %rax
retq
Although we write a value of different type, the second read still goes
to memory - it seems to be allowed to call this function with the same
(or overlapping) memory location for both arguments. In other words, a
const raw pointer does not forbid a coexisting mut raw pointer; and
it's probably fine to have two mut raw pointers (of possibly different
types) to the same (or overlapping) memory location too.
Note that a normal optimizing C/C++ compiler would eliminate the second
read (due to the "strict aliasing" rule: modifying/reading the same
memory location through pointers of different ("incompatible") types is
UB in most cases):
struct tuple { int x; int y; };

extern "C" tuple run_ptr(int const* a, float* b) {
    int const x = *a;
    *b = 1.0;
    int const y = *a;
    return tuple{x, y};
}
compiles to:
run_ptr:
movl (%rdi), %eax
movl $0x3f800000, (%rsi)
movq %rax, %rdx
salq $32, %rdx
orq %rdx, %rax
ret
Playground with Rust code examples
godbolt Compiler Explorer with C example
So: What are the semantics if we use raw pointers directly: is it ok for
referenced data to overlap?
This should have direct implications on whether the compiler is allowed
to reorder memory access through raw pointers.
No awkward strict-aliasing here
C++ strict-aliasing is a patch on a wooden leg. C++ does not have any aliasing information, and the absence of aliasing information prevents a number of optimizations (as you noted here), therefore to regain some performance strict-aliasing was patched on...
Unfortunately, strict-aliasing is awkward in a systems language, because reinterpreting raw-memory is the essence of what systems language are designed to do.
And doubly unfortunately it does not enable that many optimizations. For example, copying from one array to another must assume that the arrays may overlap.
restrict (from C) is a bit more helpful, although it only applies to one level at a time.
Instead, we have scope-based aliasing analysis
The essence of the aliasing analysis in Rust is based on lexical scopes (barring threads).
The beginner level explanation that you probably know is:
if you have a &T, then there is no &mut T to the same instance,
if you have a &mut T, then there is no &T or &mut T to the same instance.
As suited to a beginner, it is a slightly abbreviated version. For example:
fn main() {
    let mut i = 32;
    let mut_ref = &mut i;
    let x: &i32 = mut_ref;
    println!("{}", x);
}
is perfectly fine, even though both a &mut i32 (mut_ref) and a &i32 (x) point to the same instance!
If you try to access mut_ref after forming x, however, the truth is unveiled:
fn main() {
    let mut i = 32;
    let mut_ref = &mut i;
    let x: &i32 = mut_ref;
    *mut_ref = 2;
    println!("{}", x);
}
error[E0506]: cannot assign to `*mut_ref` because it is borrowed
|
4 | let x: &i32 = mut_ref;
| ------- borrow of `*mut_ref` occurs here
5 | *mut_ref = 2;
| ^^^^^^^^^^^^ assignment to borrowed `*mut_ref` occurs here
So, it is fine to have both &mut T and &T pointing to the same memory location at the same time; however mutating through the &mut T will be disabled for as long as the &T exists.
In a sense, the &mut T is temporarily downgraded to a &T.
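This temporary downgrade can be sketched as follows. Note that this compiles under non-lexical lifetimes (stable since the 2018 edition); the older, purely lexical borrow checker discussed above would reject it because the reborrow lasted until end of scope:

```rust
fn main() {
    let mut i = 32;
    let mut_ref = &mut i;
    let x: &i32 = mut_ref; // shared reborrow: mut_ref is "downgraded"
    println!("{}", x);     // last use of x; the reborrow ends here
    *mut_ref = 2;          // writing through mut_ref is allowed again
    assert_eq!(i, 2);
}
```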
So, what of pointers?
First of all, let's review what the Rust reference says about raw pointers. Raw pointers:
are not guaranteed to point to valid memory and are not even guaranteed to be non-NULL (unlike both Box and &);
do not have any automatic clean-up, unlike Box, and so require manual resource management;
are plain-old-data, that is, they don't move ownership, again unlike Box, hence the Rust compiler cannot protect against bugs like use-after-free;
lack any form of lifetimes, unlike &, and so the compiler cannot reason about dangling pointers; and
have no guarantees about aliasing or mutability other than mutation not being allowed directly through a *const T.
Conspicuously absent is any rule forbidding casting a *const T to a *mut T. That's normal: it's allowed, and therefore the last point is really more of a lint, since it can be so easily worked around.
Nomicon
A discussion of unsafe Rust would not be complete without pointing to the Nomicon.
Essentially, the rules of unsafe Rust are rather simple: uphold whatever guarantee the compiler would have if it was safe Rust.
This is not as helpful as it could be, since those rules are not set in stone yet; sorry.
Then, what are the semantics for dereferencing raw pointers?
As far as I know1:
if you form a reference from the raw pointer (&T or &mut T) then you must ensure that the aliasing rules these references obey are upheld,
if you immediately read/write, this temporarily forms a reference.
That is, providing that the caller had mutable access to the location:
pub unsafe fn run_ptr_direct(a: *const i32, b: *mut f32) -> (i32, i32) {
    let x = *a;
    *b = 1.0;
    let y = *a;
    (x, y)
}
should be valid, because *a has type i32, so there is no overlap of lifetime in references.
However, I would expect:
pub unsafe fn run_ptr_modified(a: *const i32, b: *mut f32) -> (i32, i32) {
    let x = &*a;
    *b = 1.0;
    let y = *a;
    (*x, y)
}
To be undefined behavior, because x would be live while *b is used to modify its memory.
Note how subtle the change is. It's easy to break invariants in unsafe code.
1 And I might be wrong right now, or I may become wrong in the future
I am working on a project where I am doing a lot of index-based calculation. I have a few lines like:
let mut current_x: usize = (start.x as isize + i as isize * delta_x) as usize;
start.x and i are usizes and delta_x is of type isize. Most of my data is unsigned, therefore storing it signed would not make much sense. On the other hand, when I manipulate an array I am accessing a lot I have to convert everything back to usize as seen above.
Is casting between integers expensive? Does it have an impact on runtime performance at all?
Are there other ways to handle index arithmetics easier / more efficiently?
It depends
It's basically impossible to answer your question in isolation. These types of low-level things can be aggressively combined with operations that have to happen anyway, so any amount of inlining can change the behavior. Additionally, it strongly depends on your processor; changing to a 64-bit number on an 8-bit microcontroller is probably pretty expensive!
My general advice is to not worry. Keep your types consistent, get the right answers, then profile your code and fix the issues you find.
Pragmatically, what are you going to do instead?
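One pragmatic alternative to the cast-heavy expression from the question is to keep the index unsigned and make the signed offset explicit. This sketch uses usize::checked_add_signed, which I believe has been stable since Rust 1.66:

```rust
fn main() {
    let start_x: usize = 10;
    let i: usize = 2;
    let delta_x: isize = -3;

    // The cast-heavy version from the question:
    let a = (start_x as isize + i as isize * delta_x) as usize;

    // An alternative that keeps the underflow check explicit:
    let b = start_x
        .checked_add_signed(i as isize * delta_x)
        .expect("index underflowed");

    assert_eq!(a, 4);
    assert_eq!(b, 4);
    println!("current_x = {}", b);
}
```

The checked version also turns a silent wrap-around (in release mode) into a deliberate panic, which is usually what you want for index math.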
That said, here's some concrete stuff for x86-64 and Rust 1.18.0.
Same size, changing sign
Basically no impact. If these were inlined, then you probably would never even see any assembly.
#[inline(never)]
pub fn signed_to_unsigned(i: isize) -> usize {
    i as usize
}

#[inline(never)]
pub fn unsigned_to_signed(i: usize) -> isize {
    i as isize
}
Each generates the assembly
movq %rdi, %rax
retq
Extending a value
These have to sign- or zero-extend the value, so some kind of minimal operation has to occur to fill those extra bits:
#[inline(never)]
pub fn u8_to_u64(i: u8) -> u64 {
    i as u64
}

#[inline(never)]
pub fn i8_to_i64(i: i8) -> i64 {
    i as i64
}
Generates the assembly
movzbl %dil, %eax
retq
movsbq %dil, %rax
retq
Truncating a value
Truncating is again just another move, basically no impact.
#[inline(never)]
pub fn u64_to_u8(i: u64) -> u8 {
    i as u8
}

#[inline(never)]
pub fn i64_to_i8(i: i64) -> i8 {
    i as i8
}
Generates the assembly
movl %edi, %eax
retq
movl %edi, %eax
retq
All these operations boil down to a single instruction on x86-64. Then you get into complications around "how long does an operation take" and that's even harder.
I came across an interesting case while playing with zero sized types (ZSTs). A reference to an empty array will mold to a reference with any lifetime:
fn mold_slice<'a, T>(_: &'a T) -> &'a [T] {
    &[]
}
I thought about how that is possible, since basically the "value" here lives on the stack frame of the function, yet the signature promises to return a reference to a value with a longer lifetime ('a contains the function call). I came to the conclusion that it is because the empty array [] is a ZST which basically only exists statically. The compiler can "fake" the value the reference refers to.
So I tried this:
fn mold_unit<'a, T>(_: &'a T) -> &'a () {
    &()
}
and then the compiler complained:
error: borrowed value does not live long enough
--> <anon>:7:6
|
7 | &()
| ^^ temporary value created here
8 | }
| - temporary value only lives until here
|
note: borrowed value must be valid for the lifetime 'a as defined on the block at 6:40...
--> <anon>:6:41
|
6 | fn mold_unit<'a, T>(_: &'a T) -> &'a () {
| ^
It doesn't work for the unit () type, and it also does not work for an empty struct:
struct Empty;

// fails to compile as well
fn mold_struct<'a, T>(_: &'a T) -> &'a Empty {
    &Empty
}
Somehow, the unit type and the empty struct are treated differently from the empty array. Are there any additional differences between those values besides just being ZSTs? Do the differences (&[] fitting any lifetime; &() and &Empty not) have nothing to do with ZSTs at all?
Playground example
It's not that [] is zero-sized (though it is), it's that [] is a constant, compile-time literal. This means the compiler can store it in the executable, rather than having to allocate it dynamically on the heap or stack. This, in turn, means that pointers to it last as long as they want, because data in the executable isn't going anywhere.
Annoyingly, this doesn't extend to something like &[0], because Rust isn't quite smart enough to realise that [0] is definitely constant. You can work around this by using something like:
fn mold_slice<'a, T>(_: &'a T) -> &'a [i32] {
    const C: &'static [i32] = &[0];
    C
}
This trick also works with anything you can put in a const, like () or Empty.
Realistically, however, it'd be simpler to just have functions like this return a &'static borrow, since that can be coerced to any other lifetime automatically.
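The coercion mentioned above can be sketched like this (empty_slice and borrow_for are hypothetical names for illustration):

```rust
// A &'static reference coerces to any shorter lifetime, so a function
// returning &'static [T] can be used anywhere a &'a [T] is expected.
fn empty_slice() -> &'static [i32] {
    &[]
}

fn borrow_for<'a>(_anchor: &'a i32) -> &'a [i32] {
    empty_slice() // 'static coerces to 'a
}

fn main() {
    let x = 0;
    assert!(borrow_for(&x).is_empty());
    println!("'static coerced to a shorter lifetime");
}
```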
Edit: the previous version noted that &[] is not zero sized, which was a little tangential.
Do the differences (&[] fitting any lifetime; &() and &Empty not) have nothing to do with ZSTs at all?
I think this is exactly the case. The compiler probably just treats arrays differently and there is no deeper reasoning behind it.
The only difference that could play a role is that &[] is a fat pointer, consisting of the data pointer and a length. This fat pointer itself expresses the fact that there is actually no data behind it (because length=0). &() on the other hand is just a normal pointer. Here, only the type system expresses the fact that it's not pointing to anything real. But I'm just guessing here.
To clarify: a reference fitting any lifetime means that the reference has the 'static lifetime. So instead of introducing some lifetime 'a, we can just return a static reference with the same effect (&[] works, the others don't).
There is an RFC which specifies that references to constexpr rvalues will be stored in the static data section of the executable, instead of the stack. After this RFC has been implemented (tracking issue), all of your example will compile, as [], () and Empty are constexpr rvalues. References to it will always be 'static. But the important part of the RFC is that it works for non-ZSTs, too: e.g. &27 has the type &'static i32.
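On compilers that have this RFC implemented (rvalue static promotion, which I believe has been stable since Rust 1.21), the question's examples do compile, since the referenced constants are promoted into static memory:

```rust
// Both of these now compile: () and 27 are promoted to 'static data.
fn mold_unit<'a, T>(_: &'a T) -> &'a () {
    &()
}

fn mold_int<'a, T>(_: &'a T) -> &'a i32 {
    &27
}

fn main() {
    let x = 5;
    assert_eq!(*mold_int(&x), 27);
    let _unit: &() = mold_unit(&x);
    println!("promotion in action");
}
```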
To have some fun, let's look at the generated assembly (I used the amazing Compiler Explorer)! First let's try the working version:
pub fn mold_slice() -> &'static [i32] {
&[]
}
Using the -O flag (meaning: optimizations enabled; I checked the unoptimized version, too, and it doesn't have significant differences), this is compiled down to:
mold_slice:
push rbp
mov rbp, rsp
lea rax, [rip + ref.0]
xor edx, edx
pop rbp
ret
ref.0:
The fat pointer is returned in the rax (data pointer) and rdx (length) registers. As you can see, the length is set to 0 (xor edx, edx) and the data pointer is set to this mysterious ref.0. The ref.0 is not actually referencing anything at all. It's just an empty marker. This means we return just some pointer to the data section.
Now let's just tell the compiler to trust us on &() in order to compile it:
pub fn possibly_broken() -> &'static () {
    unsafe { std::mem::transmute(&()) }
}
Result:
possibly_broken:
push rbp
mov rbp, rsp
lea rax, [rip + ref.1]
pop rbp
ret
ref.1:
Wow, we pretty much see the same result! The pointer (returned via rax) points somewhere to the data section. So it actually is a 'static reference after code generation. Only the lifetime checker doesn't quite know that and still refuses to compile the code. Well... I guess this is nothing dramatic, especially since the RFC mentioned above will fix that in near future.
Rust employs dynamic checking methods to check many things. One such example is bounds checking of arrays.
Take this code for example (a five-element array, matching the [5 x i32] in the IR below):
fn test_dynamic_checking() -> i32 {
    let x = [1, 2, 3, 4, 5];
    x[1]
}
The resulting LLVM IR is:
; Function Attrs: uwtable
define internal i32 @_ZN10dynamic_ck21test_dynamic_checking17hcef32a1e8c339e2aE() unnamed_addr #0 {
entry-block:
  %x = alloca [5 x i32]
  %0 = bitcast [5 x i32]* %x to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %0, i8* bitcast ([5 x i32]* @const7091 to i8*), i64 20, i32 4, i1 false)
  %1 = getelementptr inbounds [5 x i32], [5 x i32]* %x, i32 0, i32 0
  %2 = call i1 @llvm.expect.i1(i1 false, i1 false)
  br i1 %2, label %cond, label %next

next:                                             ; preds = %entry-block
  %3 = getelementptr inbounds i32, i32* %1, i64 1
  %4 = load i32, i32* %3
  ret i32 %4

cond:                                             ; preds = %entry-block
  call void @_ZN4core9panicking18panic_bounds_check17hcc71f10000bd8e6fE({ %str_slice, i32 }* noalias readonly dereferenceable(24) @panic_bounds_check_loc7095, i64 1, i64 5)
  unreachable
}
A branch instruction is inserted to decide whether the index is out of bounds or not, which doesn't exist in clang-compiled LLVM IR.
Here are my questions:
In what situations does Rust implement dynamic checking?
How does Rust implement dynamic checking in different situations?
Is there any way to turn off the dynamic checking?
Conceptually, Rust performs array bound checking on each and every array access. However, the compiler is very good at optimizing away the checks when it can prove that it's safe to do so.
The LLVM intermediate output is misleading because it still undergoes optimizations by the LLVM's optimizing machinery before the machine assembly is generated. A better way to inspect assembly output is by generating the final assembly using an invocation such as rustc -O --emit asm --crate-type=lib. The assembly output for your function is just:
push rbp
mov rbp, rsp
mov eax, 2
pop rbp
ret
Not only is there no bound checking in sight, there is no array to begin with, the compiler has optimized the entire function to a return 2i32! To force bound checking, the function needs to be written so that Rust cannot prove that it can be elided:
pub fn test_dynamic_checking(ind: usize) -> i32 {
    let x = [1, 2, 3, 4];
    x[ind]
}
This results in a larger assembly, where the bound check is implemented as the following two instructions:
cmp rax, 3 ; compare index with 3
ja .LBB0_2 ; if greater, jump to panic code
That is as efficient as it gets. Turning off the bound check is rarely a good idea because it can easily cause the program to crash. It can be done, but explicitly and only within the confines of an unsafe block or function:
pub unsafe fn test_dynamic_checking(ind: usize) -> i32 {
    let x = [1, 2, 3, 4];
    *x.get_unchecked(ind)
}
The generated assembly shows the error checking to be entirely omitted:
push rbp
mov rbp, rsp
lea rax, [rip + const3141]
mov eax, dword ptr [rax + 4*rdi]
pop rbp
ret
const3141:
.long 1
.long 2
.long 3
.long 4
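There are also safe ways to avoid repeated bounds checks without reaching for get_unchecked: iterate instead of indexing, or hoist a single slice check so the optimizer can prove the per-element accesses are in bounds. A sketch (whether the checks are actually elided depends on the optimizer):

```rust
pub fn sum(xs: &[i32]) -> i32 {
    xs.iter().sum() // iteration needs no per-element bounds checks
}

pub fn first_n(xs: &[i32], n: usize) -> i32 {
    let xs = &xs[..n]; // one slice check up front (panics if n > len)
    let mut total = 0;
    for i in 0..n {
        total += xs[i]; // provably in bounds; the optimizer can elide the check
    }
    total
}

fn main() {
    let x = [1, 2, 3, 4];
    assert_eq!(sum(&x), 10);
    assert_eq!(first_n(&x, 2), 3);
    println!("sums: {} {}", sum(&x), first_n(&x, 2));
}
```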
In what situations does Rust implement dynamic checking?
It's a bit of a cop out... but Rust, the language, does not implement any dynamic checking. The Rust libraries, however, starting with the core and std libraries do.
A library needs to use run-time checking whenever not doing so could lead to a memory safety issue. Examples include:
guaranteeing that an index is within bounds, such as when implementing Index
guaranteeing that no other reference to an object exists, such as when implementing RefCell
...
In general, the Rust philosophy is to perform as many checks as possible at compile-time, but some checks need to be delayed to run-time because they depend on dynamic behavior/values.
How does Rust implement dynamic checking in different situations?
As efficiently as possible.
When dynamic checking is required, the Rust code is crafted to be as efficient as possible. This can be slightly complicated, as it involves easing the work of the optimizer and conforming to patterns the optimizer recognizes (or fixing it), but fortunately a few of the developers are obsessed with performance (@bluss, for example).
Not all checks may be elided, of course, and those that remain will typically be just a branch.
Is there any way to turn off the dynamic checking?
Whenever dynamic checking is necessary to guarantee code safety, it is not possible to turn it off in safe code.
In some situations, though, an unsafe block or function may allow bypassing the check (for example get_unchecked for indexing).
This is not recommended in general, and should be a last-resort behaviour:
Most of the time, the run-time checks have little to no performance impact; CPU branch prediction is awesome like that.
Even if they have some impact, unless they sit in a very hot loop, it may not be worth optimizing them.
Finally, if the check does NOT get optimized away, it is worth trying to understand why: maybe it's impossible, or maybe the optimizer is not clever enough... in the latter case, reporting the issue is the best way to have someone dig into the optimizer and fix it (if you cannot or are not willing to do it yourself).