I have this piece of simple code:
let val: u8 = 255 + 1;
println!("{}", val);
It is said here that such code compiles normally when built with the --release flag.
I am running this code via cargo run --release, and I still see the checks:
error: this arithmetic operation will overflow
--> src/main.rs:2:19
|
2 | let val: u8 = 255 + 1;
| ^^^^^^^ attempt to compute `u8::MAX + 1_u8`, which would overflow
|
= note: `#[deny(arithmetic_overflow)]` on by default
error: could not compile `rust-bin` due to previous error
Am I missing something?
The book is slightly imprecise. Overflow is disallowed in both debug and release modes; release mode merely omits the runtime checks for performance reasons (letting the value wrap around, which CPUs typically do anyway). Static checks are not removed because they don't affect the performance of the generated code. This prints 0 in release mode and panics in debug [1]:
let x: u8 = "255".parse().unwrap();
let val: u8 = x + 1;
println!("{}", val);
You can disable the compile-time checks using #[allow(arithmetic_overflow)]. This also prints 0 in release mode and panics in debug:
#[allow(arithmetic_overflow)]
let val: u8 = 255 + 1;
println!("{}", val);
The correct approach is to not depend on this behavior of release mode, but to tell the compiler what you want. This prints 0 in both debug and release mode:
let val: u8 = 255u8.wrapping_add(1);
println!("{}", val);
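For completeness, wrapping_add is one of a family of explicit-overflow methods on the integer types. A minimal sketch of its siblings on u8 follows; the helper name describe_add is made up for this illustration and is not part of any library:

```rust
/// Hypothetical helper: evaluates `a + b` with three explicit overflow
/// policies and returns (wrapping, saturating, (overflowing value, flag)).
fn describe_add(a: u8, b: u8) -> (u8, u8, (u8, bool)) {
    (
        a.wrapping_add(b),    // wraps around modulo 256
        a.saturating_add(b),  // clamps at u8::MAX
        a.overflowing_add(b), // wrapped value plus an "overflow happened" flag
    )
}

fn main() {
    let (wrapped, saturated, (value, overflowed)) = describe_add(255, 1);
    println!(
        "wrapping: {}, saturating: {}, overflowing: ({}, {})",
        wrapped, saturated, value, overflowed
    );
}
```

All three behave the same in debug and release builds, because the overflow policy is part of the method rather than the build profile.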
[1] The example uses "255".parse() because, to my surprise, let x = 255u8; let val = x + 1; doesn't compile - in other words, rustc doesn't just prevent overflow in constant arithmetic, but also wherever else it can prove it happens. The change was apparently made in Rust 1.45, because it compiled in Rust 1.44 and older. Since it broke code that previously compiled, the change was technically backward-incompatible, but presumably broke sufficiently few actual crates that it was deemed worth it. Surprising as it is, it's quite possible that "255".parse::<u8>() + 1 will become a compile-time error in a later release.
In your code, the compiler is able to detect the problem. That's why it prevents it even in release mode. In many cases it's not possible or feasible for the compiler to detect or prevent an error.
Just as an example, imagine you have code like this:
let a = b + 5;
Let's say b's value comes from a database, user input or some other external source. The compiler cannot know that value at compile time, so it is impossible for it to detect an overflow statically in cases like that.
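When the value is only known at run time, the overflow has to be handled at run time as well. One way to do that, sketched under the assumption that b arrives as text and fits a u8, is checked_add, which returns None instead of overflowing. The function name add_five is hypothetical:

```rust
/// Hypothetical helper: parses `input` as a u8 and adds 5.
/// Returns None if parsing fails or if the addition would overflow.
fn add_five(input: &str) -> Option<u8> {
    let b: u8 = input.trim().parse().ok()?;
    b.checked_add(5)
}

fn main() {
    for input in ["10", "250", "251", "abc"] {
        match add_five(input) {
            Some(a) => println!("{} + 5 = {}", input, a),
            None => println!("{}: invalid input or overflow", input),
        }
    }
}
```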
In Rust, this code is valid:
let signedInt: i32 = 23*-1;
However, this is not:
let unsignedInt: u16 = 2;
let signedInt: i32 = unsignedInt*-1;
Which makes sense, as Rust tries to interpret -1 as if it were of the same type as unsignedInt.
So casting is needed. However, said casting becomes quite ugly when using more types:
-((unsignedInt*320) as f32)
Doing this is needed, as -(unsignedInt*320) is an invalid expression. But the code above is basically unreadable, and I was wondering what was the best way of making it both valid Rust and human-readable.
Rust requires explicit casts because implicit conversions are a common source of bugs in other languages like C. Generally you should avoid as, and use from or into instead if possible, otherwise try_from/try_into. The main exception is int<->float casts that can lose precision or range, which are only possible with as at the moment.
Because all numbers in u16 can be represented in i32, your second example can be written as:
let unsignedInt: u16 = 2;
let signedInt = i32::from(unsignedInt) * -1;
Your third example must still be written with an as cast, but you can leave out the variable type:
let unsignedInt: u16 = 2;
let float = -((unsignedInt * 320) as f32);
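As a hedged aside, u16 to f32 happens to be one of the lossless conversions for which the standard library implements From, so in this particular case the cast can also be avoided by converting before multiplying. The sketch below shows the three conversion styles side by side; the function names negate, scaled and to_u16 are made up for illustration:

```rust
use std::convert::TryFrom;

/// u16 -> i32 is lossless, so `From` is available.
fn negate(unsigned_int: u16) -> i32 {
    i32::from(unsigned_int) * -1
}

/// u16 -> f32 is lossless as well. Converting before multiplying also
/// keeps the multiplication from overflowing u16.
fn scaled(unsigned_int: u16) -> f32 {
    -(f32::from(unsigned_int) * 320.0)
}

/// i32 -> u16 can fail, so `TryFrom` reports the failure instead of truncating.
fn to_u16(signed_int: i32) -> Option<u16> {
    u16::try_from(signed_int).ok()
}

fn main() {
    println!("{}", negate(2));
    println!("{}", scaled(2));
    println!("{:?} {:?}", to_u16(2), to_u16(-2));
}
```

Note that scaled is not a drop-in replacement for the as version when the integer product is meant to wrap or panic on overflow; it computes the product in floating point instead.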
I thought that once an object is moved, the memory occupied by it on the stack can be reused for other purpose. However, the minimal example below shows the opposite.
#[inline(never)]
fn consume_string(s: String) {
drop(s);
}
fn main() {
println!(
"String occupies {} bytes on the stack.",
std::mem::size_of::<String>()
);
let s = String::from("hello");
println!("s at {:p}", &s);
consume_string(s);
let r = String::from("world");
println!("r at {:p}", &r);
consume_string(r);
}
After compiling the code with --release flag, it gives the following output on my computer.
String occupies 24 bytes on the stack.
s at 0x7ffee3b011b0
r at 0x7ffee3b011c8
It is pretty clear that even if s is moved, r does not reuse the 24-byte chunk on the stack that originally belonged to s. I suppose that reusing the stack memory of a moved object is safe, but why does the Rust compiler not do it? Am I missing any corner case?
Update:
If I enclose s by curly brackets, r can reuse the 24-byte chunk on the stack.
#[inline(never)]
fn consume_string(s: String) {
drop(s);
}
fn main() {
println!(
"String occupies {} bytes on the stack.",
std::mem::size_of::<String>()
);
{
let s = String::from("hello");
println!("s at {:p}", &s);
consume_string(s);
}
let r = String::from("world");
println!("r at {:p}", &r);
consume_string(r);
}
The code above gives the output below.
String occupies 24 bytes on the stack.
s at 0x7ffee2ca31f8
r at 0x7ffee2ca31f8
I thought that the curly brackets should not make any difference, because the lifetime of s ends after calling consume_string(s) and its drop handler is called within consume_string(). Why does adding the curly brackets enable the optimization?
The version of the Rust compiler I am using is given below.
rustc 1.54.0-nightly (5c0292654 2021-05-11)
binary: rustc
commit-hash: 5c029265465301fe9cb3960ce2a5da6c99b8dcf2
commit-date: 2021-05-11
host: x86_64-apple-darwin
release: 1.54.0-nightly
LLVM version: 12.0.1
Update 2:
I would like to clarify the focus of this question. I want to know which of the following categories the proposed "stack reuse optimization" falls into.
This is an invalid optimization. Under certain cases the compiled code may fail if we perform the "optimization".
This is a valid optimization, but the compiler (including both rustc frontend and llvm) is not capable of performing it.
This is a valid optimization, but is temporarily turned off, like this.
This is a valid optimization, but is missed. It will be added in the future.
My TLDR conclusion: A missed optimization opportunity.
So the first thing I did was look into whether your consume_string function actually makes a difference. To do this I created the following (a bit more) minimal example:
struct Obj([u8; 8]);
fn main()
{
println!(
"Obj occupies {} bytes on the stack.",
std::mem::size_of::<Obj>()
);
let s = Obj([1,2,3,4,5,6,7,8]);
println!("{:p}", &s);
std::mem::drop(s);
let r = Obj([11,12,13,14,15,16,17,18]);
println!("{:p}", &r);
std::mem::drop(r);
}
Instead of consume_string I use std::mem::drop which is dedicated to simply consuming an object. This code behaves just like yours:
Obj occupies 8 bytes on the stack.
0x7ffe81a43fa0
0x7ffe81a43fa8
Removing the drop doesn't affect the result.
So the question is then why rustc doesn't notice that s is dead before r goes live. As your second example shows, enclosing s in a scope will allow the optimization.
Why does this work? Because the Rust semantics dictate that an object is dropped at the end of its scope. Since s is in an inner scope, it is dropped before the scope exits. Without the scope, s is alive until the main function exits.
Why doesn't it work when moving s into a function, where it should be dropped on exit?
Probably because rustc doesn't flag the memory location used by s as free after the function call. As has been mentioned in the comments, it is LLVM that actually handles this optimization (called 'Stack Coloring' as far as I can tell), which means rustc must correctly tell it when the memory is no longer in use. Clearly, from your last example, rustc does it on scope exit, but apparently not when an object is moved.
I think calling drop does not free the stack memory of s; it only runs the destructor.
In the first case, s still occupies its stack slot, so the slot cannot be reused.
In the second case, the {} scope ends, so the slot is freed and the stack memory is reused.
I've been stepping through the Programming Rust book and wanted to observe the two's complement wrapping, so simple code of:
fn main() {
let mut x: u8 = 255;
println!("the value of x is {}", x) ;
x = 255 + 1 ;
println!("The value of x now is {}",x) ;
}
when I try and compile this with Cargo as per the guide, I run
cargo build --release
which the book says will let it compile without overflow protection, but it won't compile. I get the protection error
|
6 | x = 255 + 1 ;
| ^^^^^^^^^^^ attempt to compute u8::MAX + 1_u8, which would overflow
Can you explain what I'm doing wrong please ?
I believe the value is not checked dynamically at run time (it won't panic and will wrap around instead), but it is still checked statically at compile time where possible.
In this case the compiler is able to determine at compile time what you're trying to do and prevents you from doing it.
That being said if you look at the compiler output you can see the following message:
note: #[deny(arithmetic_overflow)] on by default
You'll see this message regardless of the optimization level.
If you'd like to observe the overflow put the following inner attribute at the top of your source file.
#![allow(arithmetic_overflow)]
Or, if you're compiling with rustc directly you can pass the following flags:
-O -A arithmetic_overflow
The rustc docs show that the following lints are on by default (regardless of optimization level)
ambiguous_associated_items
arithmetic_overflow
conflicting_repr_hints
const_err
ill_formed_attribute_input
incomplete_include
invalid_type_param_default
macro_expanded_macro_exports_accessed_by_absolute_paths
missing_fragment_specifier
mutable_transmutes
no_mangle_const_items
order_dependent_trait_objects
overflowing_literals
patterns_in_fns_without_body
pub_use_of_private_extern_crate
soft_unstable
unconditional_panic
unknown_crate_types
useless_deprecated
When you write a literal 255+1 in your code, the compiler evaluates the expression at compile-time and sees the overflow immediately, whether in debug or release mode. When the book says that --release disables overflow protection, it's talking about runtime checks. You can see the difference with this code:
fn increment (x: u8) -> u8 { x + 1 }
fn main() {
let x = 255;
println!("x: {}, x+1: {}", x, increment (x));
}
If you run this code in debug mode, you get:
thread 'main' panicked at 'attempt to add with overflow', src/main.rs:1:30
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
But if you run it in release mode, you get:
x: 255, x+1: 0
I'm trying to learn Rust using the Rust book and the Exercism.io website.
I have an issue with this specific exercise. The code is as follows:
pub fn series(_digits: &str, _len: usize) -> Vec<String> {
(0.._digits.len() + 1 - _len)
.map(|i| _digits[i..i + _len].to_string())
.collect()
}
For example, series("12345", 3) should return a Vec containing ["123", "234", "345"].
Instead of (0.._digits.len() + 1 - _len), I experimented using (0.._digits.len() - _len + 1) instead, but in this case, the unit test "test_too_long" fails:
#[test]
#[ignore]
fn test_too_long() {
let expected: Vec<String> = vec![];
assert_eq!(series("92017", 6), expected);
}
I'm surprised because it looks like it's the same to me. Why did it fail?
This happens because in debug mode, arithmetic operations that would overflow instead panic, and panicking causes tests to fail.
With the rearranged version (playground), in series("12345", 6), digits.len() - len + 1 becomes 5usize - 6usize + 1usize. The program doesn't even get to the + 1, because just 5usize - 6usize panics. (usize can't represent negative numbers, so subtracting 6 from 5 causes overflow.)
The error message contains a strong hint at the nature of the failure:
---- test_too_long stdout ----
thread 'test_too_long' panicked at 'attempt to subtract with overflow', src/lib.rs:2:9
note: Run with `RUST_BACKTRACE=1` for a backtrace.
digits.len() + 1 - len works, however, because 6 is exactly one more than the length of the string, and so 5 + 1 - 6 can evaluate to zero without overflow. But if you change test_too_long to call series("12345", 7) instead, both versions panic. This seems like an oversight on the part of whoever wrote the test suite, especially considering that the instructions don't specify the expected behavior:
And if you ask for a 6-digit series from a 5-digit string, you deserve whatever you get.
For what it's worth, here's one way to make series return an empty vector for any len greater than the length of the input: (digits.len() + 1).saturating_sub(len) is like digits.len() + 1 - len, but if the result of the subtraction would be less than 0, it just returns 0.
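Put together, that suggestion could look like the sketch below. It keeps the exercise's signature but drops the underscore prefixes, and it assumes the input is ASCII digits, since it slices by byte index:

```rust
/// Variant of `series` that returns an empty vector whenever `len`
/// is greater than the length of the input.
pub fn series(digits: &str, len: usize) -> Vec<String> {
    // saturating_sub clamps the result at 0 instead of overflowing,
    // so the range below is empty when len > digits.len() + 1.
    let count = (digits.len() + 1).saturating_sub(len);
    (0..count)
        .map(|i| digits[i..i + len].to_string())
        .collect()
}

fn main() {
    println!("{:?}", series("12345", 3));
    println!("{:?}", series("92017", 6));
    println!("{:?}", series("12345", 7));
}
```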
In Rust, there are two possibilities to take a reference
Borrow, i.e., take a reference but don't allow mutating the reference destination. The & operator borrows ownership from a value.
Borrow mutably, i.e., take a reference to mutate the destination. The &mut operator mutably borrows ownership from a value.
The Rust documentation about borrowing rules says:
First, any borrow must last for a scope no greater than that of the
owner. Second, you may have one or the other of these two kinds of
borrows, but not both at the same time:
one or more references (&T) to a resource,
exactly one mutable reference (&mut T).
I believe that taking a reference is creating a pointer to the value and accessing the value by the pointer. This could be optimized away by the compiler if there is a simpler equivalent implementation.
However, I don't understand what move means and how it is implemented.
For types implementing the Copy trait it means copying e.g. by assigning the struct member-wise from the source, or a memcpy(). For small structs or for primitives this copy is efficient.
And for move?
This question is not a duplicate of What are move semantics? because Rust and C++ are different languages and move semantics are different between the two.
Semantics
Rust implements what is known as an Affine Type System:
Affine types are a version of linear types imposing weaker constraints, corresponding to affine logic. An affine resource can only be used once, while a linear one must be used once.
Types that are not Copy, and are thus moved, are Affine Types: you may use them either once or never, nothing else.
Rust qualifies this as a transfer of ownership in its Ownership-centric view of the world (*).
(*) Some of the people working on Rust are much more qualified than I am in CS, and they knowingly implemented an Affine Type System; however contrary to Haskell which exposes the math-y/cs-y concepts, Rust tends to expose more pragmatic concepts.
Note: it could be argued that Affine Types returned from a function tagged with #[must_use] are actually Linear Types from my reading.
Implementation
It depends. Please keep in mind that Rust is a language built for speed, and there are numerous optimization passes at play here which will depend on the compiler used (rustc + LLVM, in our case).
Within a function body (playground):
fn main() {
let s = "Hello, World!".to_string();
let t = s;
println!("{}", t);
}
If you check the LLVM IR (in Debug), you'll see:
%_5 = alloca %"alloc::string::String", align 8
%t = alloca %"alloc::string::String", align 8
%s = alloca %"alloc::string::String", align 8
%0 = bitcast %"alloc::string::String"* %s to i8*
%1 = bitcast %"alloc::string::String"* %_5 to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* %1, i8* %0, i64 24, i32 8, i1 false)
%2 = bitcast %"alloc::string::String"* %_5 to i8*
%3 = bitcast %"alloc::string::String"* %t to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* %3, i8* %2, i64 24, i32 8, i1 false)
Underneath the covers, rustc invokes a memcpy from the result of "Hello, World!".to_string() to s and then to t. While it might seem inefficient, checking the same IR in Release mode you will realize that LLVM has completely elided the copies (realizing that s was unused).
The same situation occurs when calling a function: in theory you "move" the object into the function stack frame, however in practice if the object is large the rustc compiler might switch to passing a pointer instead.
Another situation is returning from a function, but even then the compiler might apply "return value optimization" and build directly in the caller's stack frame -- that is, the caller passes a pointer into which to write the return value, which is used without intermediary storage.
The ownership/borrowing constraints of Rust enable optimizations that are difficult to reach in C++ (which also has RVO but cannot apply it in as many cases).
So, the digest version:
moving large objects is inefficient, but there are a number of optimizations at play that might elide the move altogether
moving involves a memcpy of std::mem::size_of::<T>() bytes, so moving a large String is efficient because it only copies the String value itself (24 bytes in the IR above), whatever the size of the allocated buffer it holds onto
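The second point can be observed from safe code. The following is a minimal sketch, not a statement about what the optimizer emits: it checks that a String value has the same in-memory size regardless of its contents, and that the heap buffer stays at the same address across a move. The helper name buffer_survives_move is made up:

```rust
use std::mem::size_of_val;

/// Hypothetical helper: moves `s` into a new binding and reports whether
/// the heap buffer is still at the same address afterwards.
fn buffer_survives_move(s: String) -> bool {
    let before = s.as_ptr();
    let t = s; // move: at most the String value itself is copied, not the buffer
    before == t.as_ptr()
}

fn main() {
    let small = String::from("hi");
    let large = "x".repeat(1_000_000);

    // The String value has the same size however long the text is.
    println!("{} {}", size_of_val(&small), size_of_val(&large));

    // The million-byte buffer is not copied by the move.
    println!("{}", buffer_survives_move(large));
}
```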
When you move an item, you are transferring ownership of that item. That's a key component of Rust.
Let's say I had a struct, and then I assign the struct from one variable to another. By default, this will be a move, and I've transferred ownership. The compiler will track this change of ownership and prevent me from using the old variable any more:
pub struct Foo {
value: u8,
}
fn main() {
let foo = Foo { value: 42 };
let bar = foo;
println!("{}", foo.value); // error: use of moved value: `foo.value`
println!("{}", bar.value);
}
how it is implemented.
Conceptually, moving something doesn't need to do anything. In the example above, there wouldn't be a reason to actually allocate space somewhere and then move the allocated data when I assign to a different variable. I don't actually know what the compiler does, and it probably changes based on the level of optimization.
For practical purposes though, you can think that when you move something, the bits representing that item are duplicated as if via memcpy. This helps explain what happens when you pass a variable to a function that consumes it, or when you return a value from a function (again, the optimizer can do other things to make it efficient, this is just conceptually):
// Ownership is transferred from the caller to the callee
fn do_something_with_foo(foo: Foo) {}
// Ownership is transferred from the callee to the caller
fn make_a_foo() -> Foo { Foo { value: 42 } }
"But wait!", you say, "memcpy only comes into play with types implementing Copy!". This is mostly true, but the big difference is that when a type implements Copy, both the source and the destination are valid to use after the copy!
One way of thinking of move semantics is the same as copy semantics, but with the added restriction that the thing being moved from is no longer a valid item to use.
However, it's often easier to think of it the other way: The most basic thing that you can do is to move / give ownership away, and the ability to copy something is an additional privilege. That's the way that Rust models it.
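A small sketch of that model, with made-up type names Token and Point: a plain struct is move-only, and deriving Copy is what grants the extra privilege of using the source afterwards.

```rust
// Move-only by default: no Copy implementation.
struct Token {
    id: u32,
}

// Opting in to copy semantics.
#[derive(Clone, Copy)]
struct Point {
    x: i32,
    y: i32,
}

fn copy_then_sum(p: Point) -> i32 {
    let q = p; // copy: `p` remains valid
    p.x + p.y + q.x + q.y
}

fn move_then_read(t: Token) -> u32 {
    let u = t; // move: using `t` after this line would not compile
    u.id
}

fn main() {
    println!("{}", copy_then_sum(Point { x: 1, y: 2 }));
    println!("{}", move_then_read(Token { id: 7 }));
}
```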
This is a tough question for me! After using Rust for a while the move semantics are natural. Let me know what parts I've left out or explained poorly.
Rust's move keyword always bothered me, so I decided to write down my understanding, which I arrived at after discussing it with my colleagues.
I hope this might help someone.
let x = 1;
In the above statement, x is a variable whose value is 1. Now,
let y = || println!("y is a variable whose value is a closure");
So, move keyword is used to transfer the ownership of a variable to the closure.
In the below example, without move, x is not owned by the closure. Hence x is not owned by y and available for further use.
let x = 1;
let y = || println!("this is a closure that prints x = {}", x);
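On the other hand, in the next case, x is owned by the closure, i.e. by y. For a type that is not Copy (a String, say), x would no longer be available for further use; an integer like this one is Copy, so the closure owns its own copy and x itself stays usable.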
let x = 1;
let y = move || println!("this is a closure that prints x = {}", x);
By owning I mean containing as a member variable. The example cases above are in the same situation as the following two cases. We can also assume the below explanation as to how the Rust compiler expands the above cases.
The former (without move; i.e. no transfer of ownership),
struct ClosureObject<'a> {
    x: &'a u32,
}
let x = 1;
let y = ClosureObject {
    x: &x,
};
The latter (with move; i.e. transfer of ownership),
struct ClosureObject {
x: u32
}
let x = 1;
let y = ClosureObject {
x: x
};
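The difference is easier to observe with a type that is not Copy. In the sketch below (make_greeter is a hypothetical name), the closure must own name to be returned from the function, so move is required, whereas with an integer the closure simply owns a copy:

```rust
/// Hypothetical helper: returns a closure that owns `name`.
fn make_greeter(name: String) -> impl Fn() -> String {
    // `move` makes the closure own `name`, so it can outlive this function.
    move || format!("hello, {}", name)
}

fn main() {
    let name = String::from("world");
    let greet = make_greeter(name);
    // `name` was moved (String is not Copy), so it cannot be used here.
    println!("{}", greet());

    let x = 1;
    let add_one = move || x + 1;
    // i32 is Copy: the closure owns a copy and `x` is still usable.
    println!("{} {}", add_one(), x);
}
```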
Please let me answer my own question. I had trouble, but by asking a question here I did Rubber Duck Problem Solving. Now I understand:
A move is a transfer of ownership of the value.
For example the assignment let x = a; transfers ownership: At first a owned the value. After the let it's x who owns the value. Rust forbids to use a thereafter.
In fact, if you do println!("a: {:?}", a); after the let, the Rust compiler says:
error: use of moved value: `a`
println!("a: {:?}", a);
^
Complete example:
#[derive(Debug)]
struct Example { member: i32 }
fn main() {
let a = Example { member: 42 }; // A struct is moved
let x = a;
println!("a: {:?}", a);
println!("x: {:?}", x);
}
And what does this move mean?
It seems that the concept comes from C++11. A document about C++ move semantics says:
From a client code point of view, choosing move instead of copy means that you don't care what happens to the state of the source.
Aha. C++11 does not care what happens to the source. So in this vein, Rust is free to forbid using the source after a move.
And how is it implemented?
I don't know. But I can imagine that Rust does literally nothing. x is just a different name for the same value. Names usually are compiled away (except of course debugging symbols). So it's the same machine code whether the binding has the name a or x.
It seems C++ does the same in copy constructor elision.
Doing nothing is the most efficient possible.
Passing a value to function, also results in transfer of ownership; it is very similar to other examples:
struct Example { member: i32 }
fn take(ex: Example) {
// 2) Now ex is pointing to the data a was pointing to in main
println!("a.member: {}", ex.member)
// 3) When ex goes out of scope, so does access to the data it
// was pointing to. So Rust frees that memory.
}
fn main() {
let a = Example { member: 42 };
take(a); // 1) The ownership is transferred to the function take
// 4) We can no longer use a to access the data it pointed to
println!("a.member: {}", a.member);
}
Hence the expected error:
post_test_7.rs:12:30: 12:38 error: use of moved value: `a.member`
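If the caller needs the value after the call, two common ways around this error are to lend the value instead of giving it away, or to have the function hand ownership back. A minimal sketch, with function names inspect and bump made up for illustration:

```rust
struct Example {
    member: i32,
}

/// Borrows the value: the caller keeps ownership.
fn inspect(ex: &Example) -> i32 {
    ex.member
}

/// Takes ownership, modifies the value and returns it to the caller.
fn bump(mut ex: Example) -> Example {
    ex.member += 1;
    ex
}

fn main() {
    let a = Example { member: 42 };
    println!("a.member: {}", inspect(&a)); // `a` is still usable afterwards
    let a = bump(a); // moved in, then moved back out
    println!("a.member: {}", a.member);
}
```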
let s1:String= String::from("hello");
let s2:String= s1;
To ensure memory safety, Rust invalidates s1 after the assignment, so instead of a shallow copy, this is called a move.
fn main() {
    // Each value in Rust has a variable that is called its owner.
    // There can only be one owner at a time.
    let s = String::from("hello");
    take_ownership(s);
    println!("{}", s);
    // Error: borrow of moved value `s`: value borrowed here after move.
    // Passing a parameter to a function works like assigning it to another variable:
    // `s` is moved into `my_string`, `println!("{}", my_string)` runs and prints it,
    // and when take_ownership returns, `my_string` is dropped.
    let x: i32 = 2;
    makes_copy(x);
    // Instead of being moved, integers are copied, so we can still use `x` after the call.
    // Primitive types are Copy; they are stored on the stack because their size is known at compile time.
    println!("{}", x);
}
fn take_ownership(my_string: String) {
    println!("{}", my_string);
}
fn makes_copy(some_integer: i32) {
    println!("{}", some_integer);
}