References and move vs. copy - rust

I'm reading through chapter 4 of the Rust book, coming from a Python background.
What confuses me a bit is that this doesn't compile:
fn do_something(s3: String) {
println!("{}", s3);
}
fn main() {
let s = String::from("Hello");
do_something(s);
println!("{}", s);
}
(as it shouldn't, because s is moved in the function call), but this does:
fn do_something(s3: &String) {
println!("{}", s3);
}
fn main() {
let s1 = String::from("Hello");
let s2 = &s1;
do_something(s2);
println!("{}", s2);
}
if I understand correctly, this is because s1 is already a pointer, so s2 is a pointer (well, a refrence) to a (stack-allocated) pointer. Therefore, s2 is copied in the function call and not moved. Is this so? Is the above the same as this?
fn main() {
let a = 5;
let b = &a;
do_something(b);
println!("{}", b);
}

fn main() {
let s1 = String::from("Hello");
let s2 = &s1;
do_something(s2);
println!("{}", s2);
}
This works because references are Copy. s2 is copied, not moved, and can continue to be used after the call to do_something.
Is the above the same as this?
fn main() {
let a = 5;
let b = &a;
do_something(b);
println!("{}", b);
}
Yes, it is. It doesn't matter if a or s1 are Copy, only that b or s2 are.
I guess my question is, are they Copy because they are always pointers to stack-allocated data?
All shared references are Copy because of the blanket implementation:
impl<'_, T> Copy for &'_ T
where
T: ?Sized,
It doesn't matter where the references point, stack or heap or .data section or anywhere else. References are small word-sized values and it's convenient and efficient to copy them, so Rust makes them Copy.
That's the usual metric: Is copying efficient? Strings aren't Copy because cloning them is an expensive O(n) operation. &strs, being references, are Copy because copying pointers is cheap.
(It's not the only metric, of course. Some types are not Copy because they are logically unique. For example, Box is also just a pointer, but it's an owned type, not a shared reference.)

Related

Borrowed value of RefCell does not live long enough

I created two example codes, one using references of a Box and the other using borrow_mut of a RefCell and they don't work the same, to my surprise.
use std::cell::RefCell;
#[derive(Debug)]
struct A {
a: usize,
}
fn main() {
let m;
let mut n = Box::new(A{a:5});
{let a = &mut n; m = &mut a.a;}
*m = 6;
println!("{}", m);
println!("{:?}", n);
let mm;
let mut nn = RefCell::new(A{a:5});
{let mut aa = nn.borrow_mut(); mm = &mut aa.a;}
*mm = 6;
println!("{:?}", mm);
println!("{:?}", nn);
}
I'm getting an error saying the aa borrowed value does not live long enough. &mut aa.a should be &mut usize so nothing to do with a RefCell and should work just like the Box example. Why am I getting this error?
The problem is that while the borrow &mut n is just a pure reference, nn.borrow_mut() is not. It returns a RefMut that implements Drop - because it needs to mark the RefCell as no longer borrowed after it is dropped. Therefore, the compiler can extend the lifetime of &mut n as it doesn't do anything when dropped, but it cannot do so for nn.borrow_mut() because it alters the behavior of the program.

How function that returns structure work? [duplicate]

In the below example:
struct Foo {
a: [u64; 100000],
}
fn foo(mut f: Foo) -> Foo {
f.a[0] = 99999;
f.a[1] = 99999;
println!("{:?}", &mut f as *mut Foo);
for i in 0..f.a[0] {
f.a[i as usize] = 21444;
}
return f;
}
fn main(){
let mut f = Foo {
a:[0;100000]
};
println!("{:?}", &mut f as *mut Foo);
f = foo(f);
println!("{:?}", &mut f as *mut Foo);
}
I find that before and after passing into the function foo, the address of f is different. Why does Rust copy such a big struct everywhere but not actually move it (or achieve this optimization)?
I understand how stack memory works. But with the information provided by ownership in Rust, I think the copy can be avoided. The compiler unnecessarily copies the array twice. Can this be an optimization for the Rust compiler?
A move is a memcpy followed by treating the source as non-existent.
Your big array is on the stack. That's just the way Rust's memory model works: local variables are on the stack. Since the stack space of foo is going away when the function returns, there's nothing else the compiler can do except copy the memory to main's stack space.
In some cases, the compiler can rearrange things so that the move can be elided (source and destination are merged into one thing), but this is an optimization that cannot be relied on, especially for big things.
If you don't want to copy the huge array around, allocate it on the heap yourself, either via a Box<[u64]>, or simply by using Vec<u64>.

A bit problem about Rust's function parameter and ownership [duplicate]

This question already has an answer here:
Do mutable references have move semantics?
(1 answer)
Closed 1 year ago.
here's my problem:
fn main() {
let mut s = String::from("hello");
let s1 = &mut s;
let s2 = s1;
*s2 = String::from("world1");
*s1 = String::from("world2");
println!("{:?}", s);
}
it will result in a compile error because s1 has type &mut String which doesn't implement the Copy trait.
But if I change the code as below:
fn c(s: &mut String) -> &mut String {
s
}
fn main() {
let mut s = String::from("hello");
let s1 = &mut s;
let s2 = c(s1);
*s2 = String::from("world1");
*s1 = String::from("world2");
println!("{:?}", s);
}
it will compile without any error message.
I know when a reference passed to a function, it means the reference borrows the value insteading of owning it.
But in the situation above, it seems like when s1 was passed to fn c and returned immediatelly, s2 borrowed s1 so s1 couldn't be derefed until s2 was out of it's lifetime scope.
So what happened when s1 was passed into the fn c?
From #Denys Séguret's hint, I guess when s1 was passed to fn C, Rust core compiled the parameter s1 to something like &mut *s1, so there was an immutable borrow of s1.
That's why if we put
*s2 = String::from("world1");
behind
*s1 = String::from("world2");
Rust would tell us:
assignment to borrowed `*s1`
And when s2 goes out of it's lifetime scope, there is no borrow of s1 anymore, so s1 can be derefed again.
But I'm not quite sure whether it's a right explanation.

Why does "move" in Rust not actually move?

In the below example:
struct Foo {
a: [u64; 100000],
}
fn foo(mut f: Foo) -> Foo {
f.a[0] = 99999;
f.a[1] = 99999;
println!("{:?}", &mut f as *mut Foo);
for i in 0..f.a[0] {
f.a[i as usize] = 21444;
}
return f;
}
fn main(){
let mut f = Foo {
a:[0;100000]
};
println!("{:?}", &mut f as *mut Foo);
f = foo(f);
println!("{:?}", &mut f as *mut Foo);
}
I find that before and after passing into the function foo, the address of f is different. Why does Rust copy such a big struct everywhere but not actually move it (or achieve this optimization)?
I understand how stack memory works. But with the information provided by ownership in Rust, I think the copy can be avoided. The compiler unnecessarily copies the array twice. Can this be an optimization for the Rust compiler?
A move is a memcpy followed by treating the source as non-existent.
Your big array is on the stack. That's just the way Rust's memory model works: local variables are on the stack. Since the stack space of foo is going away when the function returns, there's nothing else the compiler can do except copy the memory to main's stack space.
In some cases, the compiler can rearrange things so that the move can be elided (source and destination are merged into one thing), but this is an optimization that cannot be relied on, especially for big things.
If you don't want to copy the huge array around, allocate it on the heap yourself, either via a Box<[u64]>, or simply by using Vec<u64>.

What do the ampersand '&' and star '*' symbols mean in Rust?

Despite thoroughly reading the documentation, I'm rather confused about the meaning of the & and * symbol in Rust, and more generally about what is a Rust reference exactly.
In this example, it seems to be similar to a C++ reference (that is, an address that is automatically dereferenced when used):
fn main() {
let c: i32 = 5;
let rc = &c;
let next = rc + 1;
println!("{}", next); // 6
}
However, the following code works exactly the same:
fn main() {
let c: i32 = 5;
let rc = &c;
let next = *rc + 1;
println!("{}", next); // 6
}
Using * to dereference a reference wouldn't be correct in C++. So I'd like to understand why this is correct in Rust.
My understanding so far, is that, inserting * in front of a Rust reference dereferences it, but the * is implicitly inserted anyway so you don't need to add it (while in C++, it's implicitly inserted and if you insert it you get a compilation error).
However, something like this doesn't compile:
fn main() {
let mut c: i32 = 5;
let mut next: i32 = 0;
{
let rc = &mut c;
next = rc + 1;
}
println!("{}", next);
}
error[E0369]: binary operation `+` cannot be applied to type `&mut i32`
--> src/main.rs:6:16
|
6 | next = rc + 1;
| ^^^^^^
|
= note: this is a reference to a type that `+` can be applied to; you need to dereference this variable once for this operation to work
= note: an implementation of `std::ops::Add` might be missing for `&mut i32`
But this works:
fn main() {
let mut c: i32 = 5;
let mut next: i32 = 0;
{
let rc = &mut c;
next = *rc + 1;
}
println!("{}", next); // 6
}
It seems that implicit dereferencing (a la C++) is correct for immutable references, but not for mutable references. Why is this?
Using * to dereference a reference wouldn't be correct in C++. So I'd like to understand why this is correct in Rust.
A reference in C++ is not the same as a reference in Rust. Rust's references are much closer (in usage, not in semantics) to C++'s pointers. With respect to memory representation, Rust's references often are just a single pointer, while C++'s references are supposed to be alternative names of the same object (and thus have no memory representation).
The difference between C++ pointers and Rust references is that Rust's references are never NULL, never uninitialized and never dangling.
The Add trait is implemented (see the bottom of the doc page) for the following pairs and all other numeric primitives:
&i32 + i32
i32 + &i32
&i32 + &i32
This is just a convenience thing the std-lib developers implemented. The compiler can figure out that a &mut i32 can be used wherever a &i32 can be used, but that doesn't work (yet?) for generics, so the std-lib developers would need to also implement the Add traits for the following combinations (and those for all primitives):
&mut i32 + i32
i32 + &mut i32
&mut i32 + &mut i32
&mut i32 + &i32
&i32 + &mut i32
As you can see that can get quite out of hand. I'm sure that will go away in the future. Until then, note that it's rather rare to end up with a &mut i32 and trying to use it in a mathematical expression.
This answer is for those looking for the basics (e.g. coming from Google).
From the Rust book's References and Borrowing:
fn main() {
let s1 = String::from("hello");
let len = calculate_length(&s1);
println!("The length of '{}' is {}.", s1, len);
}
fn calculate_length(s: &String) -> usize {
s.len()
}
These ampersands represent references, and they allow you to refer to some value without taking ownership of it [i.e. borrowing].
The opposite of referencing by using & is dereferencing, which is accomplished with the dereference operator, *.
And a basic example:
let x = 5;
let y = &x; //set y to a reference to x
assert_eq!(5, x);
assert_eq!(5, *y); // dereference y
If we tried to write assert_eq!(5, y); instead, we would get a compilation error can't compare `{integer}` with `&{integer}`.
(You can read more in the Smart Pointers chapter.)
And from Method Syntax:
Rust has a feature called automatic referencing and dereferencing. Calling methods is one of the few places in Rust that has this behavior.
Here’s how it works: when you call a method with object.something(), Rust automatically adds in &, &mut, or * so object matches the signature of the method. In other words, the following are the same:
p1.distance(&p2);
(&p1).distance(&p2);
From the docs for std::ops::Add:
impl<'a, 'b> Add<&'a i32> for &'b i32
impl<'a> Add<&'a i32> for i32
impl<'a> Add<i32> for &'a i32
impl Add<i32> for i32
It seems the binary + operator for numbers is implemented for combinations of shared (but not mutable) references of the operands and owned versions of the operands. It has nothing to do with automatic dereferencing.

Resources