Understanding struct-field mutation - struct

From the Rust book about how to mutate struct fields:
let mut point = Point { x: 0, y: 0 };
point.x = 5;
and later:
Mutability is a property of the binding, not of the structure itself.
This seems counter-intuitive to me because point.x = 5 doesn't look like I'm rebinding the variable point. Is there a way to explain this so it's more intuitive?
The only way I can wrap my head around this is to "imagine" that I'm rebinding point to a copy of the original Point with a different x value (not even sure that's accurate).

This seems counter-intuitive to me because point.x = 5 doesn't look like I'm rebinding the variable point. Is there a way to explain this so it's more intuitive?
All this is saying is that whether or not something is mutable is determined by the let- statement (the binding) of the variable, as opposed to being a property of the type or any specific field.
In the example, point and its fields are mutable because point is introduced in a let mut statement (as opposed to a simple let statement) and not because of some property of the Point type in general.
As a contrast, to show why this is interesting: in other languages, like OCaml, you can mark certain fields mutable in the definition of the type:
type point =
{ x: int;
mutable y: int;
};
This means that you can mutate the y field of every point value, but you can never mutate x.

I had the same confusion. For me it came from two separate misunderstandings. First, I came from a language where variables (aka bindings) were implicitly references to values. In that language it was important to distinguish between mutating the reference, and mutating the value that was referred to. Second, I thought by "the structure itself" the book was referring to the instantiated value, but by "the structure" it means the specification/declaration, not a particular value of that type.
Variables in Rust are different. From the reference:
A variable is a component of a stack frame...
A local variable (or stack-local allocation) holds a value directly,
allocated within the stack's memory. The value is a part of the stack
frame.
So a variable is a component of a stack frame -- a chunk of memory -- that directly holds the value. There is no reference to distinguish from the value itself, no reference to mutate. The variable and the value are the same hunk of memory.
A consequence is that rebinding a variable in the sense of changing it to refer to a different hunk of memory is not compatible with Rust's memory model. (n.b. let x = 1; let x = 2; creates two variables.)
So the book is pointing out that mutability is declared at the "per hunk of memory" level rather than as part of the definition of a struct.
The only way I can wrap my head around this is to "imagine" that I'm
rebinding point to a copy of the original Point with a different x
value (not even sure that's accurate)
Instead imagine you are changing one of the 0's in a hunk of memory to a 5; and that that value resides within the memory designated by point. Interpret "the binding is mutable" to mean that you can mutate the hunk of memory designated by the binding, including mutating just part of it, e.g. by setting a struct field. Think of rebinding Rust variables in the way that you describe as not expressible within Rust.

Here "binding" is not a verb, it is a noun. You can say that in Rust bindings are synonymous to variables. Therefore, you can read that passage like
Mutability is a property of the variable, not of the structure itself.
Now, I guess, it should be clear - you mark the variable as mutable and so you can modify its contents.

#m-n's answer set me on the right track. It's all about stack addresses! Here's a demonstration that solidified in my mind what's actually happening.
struct Point {
x: i64,
y: i64,
}
fn main() {
{
println!("== clobber binding");
let a = 1;
println!("val={} | addr={:p}", a, &a);
// This is completely new variable, with a different stack address
let a = 2;
println!("val={} | addr={:p}", a, &a);
}
{
println!("== reassign");
let mut b = 1;
println!("val={} | addr={:p}", b, &b);
// uses same stack address
b = 2;
println!("val={} | addr={:p}", b, &b);
}
{
println!("== Struct: clobber binding");
let p1 = Point{ x: 1, y: 2 };
println!(
"xval,yval=({}, {}) | pointaddr={:p}, xaddr={:p}, yaddr={:p}",
p1.x, p1.y, &p1, &p1.x, &p1.y);
let p1 = Point{ x: 3, y: 4 };
println!(
"xval,yval=({}, {}) | pointaddr={:p}, xaddr={:p}, yaddr={:p}",
p1.x, p1.y, &p1, &p1.x, &p1.y);
}
{
println!("== Struct: reassign");
let mut p1 = Point{ x: 1, y: 2 };
println!(
"xval,yval=({}, {}) | pointaddr={:p}, xaddr={:p}, yaddr={:p}",
p1.x, p1.y, &p1, &p1.x, &p1.y);
// each of these use the same addresses; no new addresses
println!(" (entire struct)");
p1 = Point{ x: 3, y: 4 };
println!(
"xval,yval=({}, {}) | pointaddr={:p}, xaddr={:p}, yaddr={:p}",
p1.x, p1.y, &p1, &p1.x, &p1.y);
println!(" (individual members)");
p1.x = 5; p1.y = 6;
println!(
"xval,yval=({}, {}) | pointaddr={:p}, xaddr={:p}, yaddr={:p}",
p1.x, p1.y, &p1, &p1.x, &p1.y);
}
}
Output (addresses are obviously slightly different per run):
== clobber binding
val=1 | addr=0x7fff6112863c
val=2 | addr=0x7fff6112858c
== reassign
val=1 | addr=0x7fff6112847c
val=2 | addr=0x7fff6112847c
== Struct: clobber binding
xval,yval=(1, 2) | pointaddr=0x7fff611282b8, xaddr=0x7fff611282b8, yaddr=0x7fff611282c0
xval,yval=(3, 4) | pointaddr=0x7fff61128178, xaddr=0x7fff61128178, yaddr=0x7fff61128180
== Struct: reassign
xval,yval=(1, 2) | pointaddr=0x7fff61127fd8, xaddr=0x7fff61127fd8, yaddr=0x7fff61127fe0
(entire struct)
xval,yval=(3, 4) | pointaddr=0x7fff61127fd8, xaddr=0x7fff61127fd8, yaddr=0x7fff61127fe0
(individual members)
xval,yval=(5, 6) | pointaddr=0x7fff61127fd8, xaddr=0x7fff61127fd8, yaddr=0x7fff61127fe0
The key points are these:
Use let to "clobber" an existing binding (new stack address). This happens even if the variable was declared mut, so be careful.
Use mut to reuse the existing stack address, but don't use let when reassigning.
This test reveals a couple of interesting things:
If you reassign an entire mutable struct, it's equivalent to assigning each member individually.
The address of the variable holding the struct is the same as the address of the first member. I guess this makes sense if you're coming from a C/C++ background.

Related

How to rotate a linked list in Rust?

I came across a leetcode example https://leetcode.com/problems/rotate-list/ and wanted to implement it in Rust but I keep moving out of my variables. For instance if I use head to track down the end of the list I cannot move its value to the last next because it is already moved out of the variable. How should I think to avoid this kind of problems?
And this is my attempt in rust
// Definition for singly-linked list.
#[derive(PartialEq, Eq, Clone, Debug)]
pub struct ListNode {
pub val: i32,
pub next: Option<Box<ListNode>>,
}
impl ListNode {
#[inline]
fn new(val: i32) -> Self {
ListNode { next: None, val }
}
}
pub fn rotate_right(head: Option<Box<ListNode>>, k: i32) -> Option<Box<ListNode>> {
if k == 0 {
return head;
}
let mut len = 0;
let mut curr = &head;
while let Some(n) = curr {
curr = &n.next;
len += 1;
}
drop(curr);
let newk = len - ( k % len);
println!("Rotate {} right is {} mod {}",k,newk,len);
let mut node = head.as_ref();
for _i in 0..newk {
if let Some(n) = node {
node = n.next.as_ref();
}else {
println!("Unexpected asswer fail");
}
}
let mut newhead = None ;// node.unwrap().next; // node will be last and newhead will be first.
if let Some(mut n) = node{ // node should have a value here
newhead = n.next;
n.next = None
}
drop(node);
let node = newhead;
if let Some(c) = node{
let mut curr = c;
while let Some(next) = curr.next {
curr = next;
}
curr.next = head;
newhead
}else{
println! ("Todo handle this corner case");
None
}
}
But this yields errors about moving and use of borrowed references.
error[E0507]: cannot move out of `n.next` which is behind a shared reference
--> src/main.rs:99:23
|
99 | newhead = n.next;
| ^^^^^^
| |
| move occurs because `n.next` has type `Option<Box<ListNode>>`, which does not implement the `Copy` trait
| help: consider borrowing the `Option`'s content: `n.next.as_ref()`
error[E0594]: cannot assign to `n.next` which is behind a `&` reference
--> src/main.rs:100:13
|
98 | if let Some(mut n) = node{ // node should have a value here
| ----- help: consider changing this to be a mutable reference: `&mut Box<ListNode>`
99 | newhead = n.next;
100 | n.next = None
| ^^^^^^ `n` is a `&` reference, so the data it refers to cannot be written
error[E0382]: use of moved value: `newhead`
--> src/main.rs:111:13
|
97 | let mut newhead = None ;// node.unwrap().next; // node will be last and newhead will be first.
| ----------- move occurs because `newhead` has type `Option<Box<ListNode>>`, which does not implement the `Copy` trait
...
104 | let node = newhead;
| ------- value moved here
...
111 | newhead
| ^^^^^^^ value used here after move
Finlay this is reference code in c of what I intended it to look like:
/**
* Definition for singly-linked list.
* struct ListNode {
* int val;
* struct ListNode *next;
* };
*/
struct ListNode* rotateRight(struct ListNode* head, int k){
struct ListNode* node = head;
int len = 1;
if (k == 0 || !head){
return head;
}
while (node->next){
node = node->next;
len++;
}
int newk = len - (k % len) -1;
node = head;
while (newk--){
if (node->next) {
node = node->next;
}
}
if ( !node->next) {
return head; // do not move anything
}
struct ListNode* newhead = node->next; // node will be last and newhead will be first.
struct ListNode* n = node->next;
node->next = 0;
node = n;
while (node->next != 0) {
node = node->next;
}
node->next = head;
return newhead;
}
So my question is what technique can be used to avoid this move use trap?
I'd like to preface this post by saying that this type of Linked List in Rust is not a particularly good representation. Unfortunately, safe Rust's ownership model makes reference chain structures such as Linked Lists problematic and hard to reason about for any more than simple operations, despite Linked Lists being very simple in other languages. While for beginners, simple but inefficient implementations of simple concepts are often important and useful, I would also like to emphasize that the safe Rust "sentinel model" for trees and linked lists also has a heavy performance cost. Learning Rust With Entirely Too Many Linked Lists is the de facto book for learning how to properly deal with these structures, and I recommend it to all beginners and intermediate Rust programmers who haven't worked through it (or at least skimmed it).
However, this particular operation is possible on sentinel model singly linked lists in purely safe Rust. The whole version of the code I'm about to present is available on the Playground, which includes some things like test cases and an iterator implementation I'll be glossing over (since it doesn't get to the core of the problem and is only used in test cases and to derive the entire size of the Linked List). I also won't be presenting the ListNode struct or the new function, since they are in the question itself. Note that I also took some design liberties (instead of a free function, I made rotate_right and rotate_left member functions, you can freely convert but this is more Rusty design).
The "obvious' implementation here is to drain the entire list and reconstruct it. This is a perfectly good solution! However it's likely inefficient (it can be more or less inefficient depending on whether you totally destroy the nodes or not). Instead we're going to be doing what you largely do in other languages: only breaking the link and then reassigning the pointers at the boundaries. In fact, once you break it down into these two steps, the problem becomes much easier! All that's required is knowledge about Rust's mutable borrowing semantics, and how to overcome the pesky requirement that references always have a definite value and cannot be moved out of.
1. Breaking the Link: mutable reference uniqueness & memory integrity
/// Splits the list at a given index, returning the new split off tail.
/// Self remains unmodified.
/// The list is zero indexed, so,
/// for instance, a value of 2 in the list [1,2,3,4,5]
/// will cause self to be [1,2,3] and the return value to be
/// Some([4,5]). I.E. the link is "broken" AT index 2.
///
/// If there's nothing to split off, `None` is returned.
fn split_at(&mut self, n: usize) -> Option<Self> {
use std::mem;
let mut split = Some(self);
for _ in 0..n {
if let Some(node) = split.map(|v| v.next.as_deref_mut()) {
split = node;
} else {
return None;
}
}
match split {
Some(node) => {
let mut new_head = None;
mem::swap(&mut new_head, &mut node.next);
new_head.map(|v| *v)
}
None => None
}
}
The logic is straightforward (if you've broken off a list in any other language it's the exact same algorithm). The important thing for Rust is the ownership system and the mem::swap function. Rust ensures two things that are problematic here:
You're not mutably borrowing the same thing twice.
No piece of memory is invalid, even temporarily (particularly in self).
For 1., we use the code
if let Some(node) = split.map(|v| v.next.as_deref_mut()) {
split = node;
}
What this does is simply advances the pointer, making sure to immediately "forget" our current mutable pointer, and never do anything that directly causes "self" to be used at the same time as "node". (Note that if you tried this in an earlier version of Rust, I don't believe this was possible before Non-Lexical Lifetimes (NLLs), I think it would both alias self and node under the old rules). The map is simply to allow us to reassign to split by "reborrowing" the inner Box as an actual reference, so we can recycle the seed variable we start as self (since you can't reassign self in this way).
For solving 2. we use:
match split {
Some(node) => {
let mut new_head = None;
mem::swap(&mut new_head, &mut node.next);
new_head.map(|v| *v)
}
None => None
}
The key here is the mem::swap line. Without this Rust will complain about you moving node.next. Safe Rust requires you to ensure there is always a value in every variable. The only exception is if you're permanently destroying a value, which doesn't work on references. I think there's some QOL work on the compiler at the moment to allow you to move out of a mutable reference if you immediately put something in its place and the compiler can prove there will be no panic or return between these operations (i.e. memory cannot become corrupted during a stack unwind), but as of Rust 1.47 this is not available.
2. Stitching the list back together: memory integrity
/// Attaches the linked list `new_tail` to the end
/// of `self`. For instance
/// `[1,2,3]` and `[4,5]` will make self a list containing
/// `[1,2,3,4,5]`.
fn extend_from_list(&mut self, new_tail: ListNode) {
let mut tail = self;
while tail.next.is_some() {
tail = tail.next.as_deref_mut().unwrap();
}
tail.next = Some(Box::new(new_tail))
}
This method isn't too complicated, it just assigns the next reference to the head node of the passed in list. If you've written a function to attach a new node it's functionally the same. Again, the key is that we advance the pointer to the end, being very careful we never allow a mutable reference to alias, which allows us to appease the Borrow Checker.
From there, we can finish up!
/// Rotates the list `k` elements left. So
/// `[1,2,3]` rotated 2 left is `[3,1,2]`.
///
/// The function handles rotating an arbitrary number of steps
/// (i.e. rotating a size 3 list by 5 is the same as rotating by 2).
fn rotate_left(&mut self, k: usize) {
use std::mem;
let k = k % self.iter().count();
if k == 0 {
return;
}
let k = k-1;
if let Some(mut new_head) = self.split_at(k) {
mem::swap(self, &mut new_head);
let new_tail = new_head;
self.extend_from_list(new_tail);
} else {
unreachable!("We should handle this with the mod!")
}
}
/// Rotates the list `k` elements right. So
/// `[1,2,3]` rotated 2 right is `[2,3,1]`.
///
/// The function handles rotating an arbitrary number of steps
/// (i.e. rotating a size 3 list by 5 is the same as rotating by 2).
fn rotate_right(&mut self, k: usize) {
self.rotate_left(self.iter().count() - k)
}
Rotating left is conceptually easiest, since it's just splitting a list at the given index (k-1 because to rotate k places we need to break the link after counting one fewer indices). Again, we need to make use of mem::swap, the only tricky thing about this code, because the part we split off is the new head and Rust won't let us use temp variables due to the memory integrity guarantee.
Also, it turns out rotating right k places is just rotating left size-k places, and is much easier to conceptualize that way.
This is all somewhat inefficient though because we're marching to the end of the list each time (particularly for the list fusion), because we're not holding onto the tail pointer and instead re-iterating. I'm fairly certain you can fix this, but it may be easier to make it one large function instead of sub-functions, since returning a downstream tail pointer would require some lifetime flagging. I think if you want to take a next step (besides the Linked List book, which is probably a better use of your time), it would be good to see if you can preserve the tail pointer to eliminate the constant marching down the list.

Immutable variables in RUST can be reassigned using destructuring? [duplicate]

This question already has an answer here:
In Rust, what's the difference between "shadowing" and "mutability"?
(1 answer)
Closed 2 years ago.
I was quite surprised to find that the following program will happily compile and run (using "cargo 1.42.0 (86334295e 2020-01-31)."), outputting:
5
k
The variable x which is not declared as mut is not only reassigned but reassigned with a different type.
Is there some reason why you are allowed to do this?
fn main() {
let x = 5;
println!("{}", x);
let t: (i32, f64, char) = (2, 3.14, 'k');
let (_,_,x) = t;
println!("{}", x);
}
This is called "shadowing a variable"
(https://doc.rust-lang.org/book/ch03-01-variables-and-mutability.html#shadowing)
It can also been shown as simply as so:
let x = 5;
let x = 'k';
It actually comes in handy often. For instance, you can reuse an identifier after you are done using its initially assigned value:
let two_times_five = 2 * 5; // type i32
let two_times_five = two_times_five.to_string(); // type String
The compiler will still enforce strong typing; accesses to two_times_five before its redefinition will be accessing an i32, accesses afterward will be accessing a String.
There are also times when you don't want a variable to be mutable, but at some point you want to assign it a different value. Using variable shadowing rather than let mut means you know the variable is not changed in between its definitions, regardless of what functions it's passed into or methods are called on it.

Confused about ref keyword usage in closures

I'm trying to understand how & and ref correspond. Here's an example where I thought these were equivalent, but one works and the other doesn't:
fn main() {
let t = "
aoeu
aoeu
aoeu
a";
let ls = t.lines();
dbg!(ls.clone().map(|l| &l[..]).collect::<Vec<&str>>().join("\n")); // works
dbg!(ls.clone().map(|ref l| l[..]).collect::<Vec<&str>>().join("\n")); // doesn't work
dbg!(ls.clone().map(|ref l| &l[..]).collect::<Vec<&str>>().join("\n")); // works again!
}
From the docs:
// A `ref` borrow on the left side of an assignment is equivalent to
// an `&` borrow on the right side.
let ref ref_c1 = c;
let ref_c2 = &c;
println!("ref_c1 equals ref_c2: {}", *ref_c1 == *ref_c2);
What would the equivalent to |l| &l[..] be with |ref l|? How does it correspond to the assignment examples in the docs?
Taking a look at the docs page for Lines(The iterator adapter for producing lines from a str), we can see that the item produced by it is:
type Item = &'a str;
Therefore the following happens when trying to do the "doesn't work" version:
dbg!(ls.clone().map(|ref l| l[..]).collect::<Vec<&str>>().join("\n")); # doesn't work
//Can become:
let temp = ls
.clone()
.map(|ref l| l[..])
.collect::<Vec<&str>>()
.join("\n");
println!("{}", temp);
Here we can see a crucial problem. l if of type &&str (Which I will explain below) so indexing into it will create a str, which is unsized and therefore cannot be outside of a pointer of some sort.
Now, onto the real thing you wanted to learn: What a ref pattern does:
When doing pattern matching or destructuring via the let binding, the ref keyword can be used to take references to the fields of a struct/tuple.
What this does is the following:
When we have let ref x = y, we take a reference to y.
When pattern matching on something (Like in your closure arguments you showed) we have a slightly different effect: the value under the reference is moved into the scope and then taken reference to while exposing a way to take the value under the reference. For example:
fn foo(ref x: String) {}
let y: fn(String) = foo;
This works because what is essentially being done is this:
fn foo(x: String) {
let x: &String = &x;
}
So what ref x does is take ownership of x and produce a reference to it.
On the other hand
When we have let &x = y we move a value out of y.
This is the opposite of ref, in that we take ownership of the value under y if we can. For example:
let x = 2;
let y = &x;
let &z = y; //Ok, we're moving a `Copy` type
This is only ok for copy types though as though this isn't exactly the same as let x = *y which would work for owned Boxes.

Reference before assignment in Rust

I am playing around with Rust references:
fn main() {
let str = String::from("Hallo");
let &x = &str;
}
This produces the following error:
error[E0507]: cannot move out of borrowed content
--> src/main.rs:3:9
|
3 | let &x = &str;
| ^-
| ||
| |hint: to prevent move, use `ref x` or `ref mut x`
| cannot move out of borrowed content
What is going on here?
Adding to wiomoc's answer: depending on what language(s) you've previously known, variable declaration in Rust might be a little different. Whereas in C/C++ you explicitly have to declare that you want a pointer/reference variable:
int *p = &other_int;
In Rust it's enough to just use let, so the above would in Rust be
let p = &other_int;
and when you write
let &s = &string;
It pattern-matches that, so the Rust compiler reads it roughly as "I know I have a reference, and I want to bind whatever it is referring to to the name p". If you're not familiar with pattern-matching, a more obvious example (that works in Rust as well) would be
let point = (23, 42);
let (x, y) = point;
The second line unpacks the right-hand side to match the left-hand side (both are tuples of two values) and binds the variable names on the left to the values at the same position in the structure on the right. In the case above, it's less obvious that it's matching your "structural description".
The result of let &x = &str;, i.e. "I know &str is a reference, please bind whatever it refers to to the variable x" means that you're trying to have x be the same as str, when at that line all you have is a borrowed reference to str. That's why the compiler tells you you can't create an owned value (which x would be, because it's not being created as a reference) from a reference.
You dont need that &at let x
let str = String::from("Hallo");
let x = &str;
Or if you want to declare the type manually
let string = String::from("Hallo");
let x: &str = &string;

What happens to the stack when a value is moved in Rust? [duplicate]

In Rust, there are two possibilities to take a reference
Borrow, i.e., take a reference but don't allow mutating the reference destination. The & operator borrows ownership from a value.
Borrow mutably, i.e., take a reference to mutate the destination. The &mut operator mutably borrows ownership from a value.
The Rust documentation about borrowing rules says:
First, any borrow must last for a scope no greater than that of the
owner. Second, you may have one or the other of these two kinds of
borrows, but not both at the same time:
one or more references (&T) to a resource,
exactly one mutable reference (&mut T).
I believe that taking a reference is creating a pointer to the value and accessing the value by the pointer. This could be optimized away by the compiler if there is a simpler equivalent implementation.
However, I don't understand what move means and how it is implemented.
For types implementing the Copy trait it means copying e.g. by assigning the struct member-wise from the source, or a memcpy(). For small structs or for primitives this copy is efficient.
And for move?
This question is not a duplicate of What are move semantics? because Rust and C++ are different languages and move semantics are different between the two.
Semantics
Rust implements what is known as an Affine Type System:
Affine types are a version of linear types imposing weaker constraints, corresponding to affine logic. An affine resource can only be used once, while a linear one must be used once.
Types that are not Copy, and are thus moved, are Affine Types: you may use them either once or never, nothing else.
Rust qualifies this as a transfer of ownership in its Ownership-centric view of the world (*).
(*) Some of the people working on Rust are much more qualified than I am in CS, and they knowingly implemented an Affine Type System; however contrary to Haskell which exposes the math-y/cs-y concepts, Rust tends to expose more pragmatic concepts.
Note: it could be argued that Affine Types returned from a function tagged with #[must_use] are actually Linear Types from my reading.
Implementation
It depends. Please keep in mind than Rust is a language built for speed, and there are numerous optimizations passes at play here which will depend on the compiler used (rustc + LLVM, in our case).
Within a function body (playground):
fn main() {
let s = "Hello, World!".to_string();
let t = s;
println!("{}", t);
}
If you check the LLVM IR (in Debug), you'll see:
%_5 = alloca %"alloc::string::String", align 8
%t = alloca %"alloc::string::String", align 8
%s = alloca %"alloc::string::String", align 8
%0 = bitcast %"alloc::string::String"* %s to i8*
%1 = bitcast %"alloc::string::String"* %_5 to i8*
call void #llvm.memcpy.p0i8.p0i8.i64(i8* %1, i8* %0, i64 24, i32 8, i1 false)
%2 = bitcast %"alloc::string::String"* %_5 to i8*
%3 = bitcast %"alloc::string::String"* %t to i8*
call void #llvm.memcpy.p0i8.p0i8.i64(i8* %3, i8* %2, i64 24, i32 8, i1 false)
Underneath the covers, rustc invokes a memcpy from the result of "Hello, World!".to_string() to s and then to t. While it might seem inefficient, checking the same IR in Release mode you will realize that LLVM has completely elided the copies (realizing that s was unused).
The same situation occurs when calling a function: in theory you "move" the object into the function stack frame, however in practice if the object is large the rustc compiler might switch to passing a pointer instead.
Another situation is returning from a function, but even then the compiler might apply "return value optimization" and build directly in the caller's stack frame -- that is, the caller passes a pointer into which to write the return value, which is used without intermediary storage.
The ownership/borrowing constraints of Rust enable optimizations that are difficult to reach in C++ (which also has RVO but cannot apply it in as many cases).
So, the digest version:
moving large objects is inefficient, but there are a number of optimizations at play that might elide the move altogether
moving involves a memcpy of std::mem::size_of::<T>() bytes, so moving a large String is efficient because it only copies a couple bytes whatever the size of the allocated buffer they hold onto
When you move an item, you are transferring ownership of that item. That's a key component of Rust.
Let's say I had a struct, and then I assign the struct from one variable to another. By default, this will be a move, and I've transferred ownership. The compiler will track this change of ownership and prevent me from using the old variable any more:
pub struct Foo {
value: u8,
}
fn main() {
let foo = Foo { value: 42 };
let bar = foo;
println!("{}", foo.value); // error: use of moved value: `foo.value`
println!("{}", bar.value);
}
how it is implemented.
Conceptually, moving something doesn't need to do anything. In the example above, there wouldn't be a reason to actually allocate space somewhere and then move the allocated data when I assign to a different variable. I don't actually know what the compiler does, and it probably changes based on the level of optimization.
For practical purposes though, you can think that when you move something, the bits representing that item are duplicated as if via memcpy. This helps explain what happens when you pass a variable to a function that consumes it, or when you return a value from a function (again, the optimizer can do other things to make it efficient, this is just conceptually):
// Ownership is transferred from the caller to the callee
fn do_something_with_foo(foo: Foo) {}
// Ownership is transferred from the callee to the caller
fn make_a_foo() -> Foo { Foo { value: 42 } }
"But wait!", you say, "memcpy only comes into play with types implementing Copy!". This is mostly true, but the big difference is that when a type implements Copy, both the source and the destination are valid to use after the copy!
One way of thinking of move semantics is the same as copy semantics, but with the added restriction that the thing being moved from is no longer a valid item to use.
However, it's often easier to think of it the other way: The most basic thing that you can do is to move / give ownership away, and the ability to copy something is an additional privilege. That's the way that Rust models it.
This is a tough question for me! After using Rust for a while the move semantics are natural. Let me know what parts I've left out or explained poorly.
Rust's move keyword always bothers me so, I decided to write my understanding which I obtained after discussion with my colleagues.
I hope this might help someone.
let x = 1;
In the above statement, x is a variable whose value is 1. Now,
let y = || println!("y is a variable whose value is a closure");
So, move keyword is used to transfer the ownership of a variable to the closure.
In the below example, without move, x is not owned by the closure. Hence x is not owned by y and available for further use.
let x = 1;
let y = || println!("this is a closure that prints x = {}". x);
On the other hand, in this next below case, the x is owned by the closure. x is owned by y and not available for further use.
let x = 1;
let y = move || println!("this is a closure that prints x = {}". x);
By owning I mean containing as a member variable. The example cases above are in the same situation as the following two cases. We can also assume the below explanation as to how the Rust compiler expands the above cases.
The formar (without move; i.e. no transfer of ownership),
struct ClosureObject {
x: &u32
}
let x = 1;
let y = ClosureObject {
x: &x
};
The later (with move; i.e. transfer of ownership),
struct ClosureObject {
x: u32
}
let x = 1;
let y = ClosureObject {
x: x
};
Please let me answer my own question. I had trouble, but by asking a question here I did Rubber Duck Problem Solving. Now I understand:
A move is a transfer of ownership of the value.
For example the assignment let x = a; transfers ownership: At first a owned the value. After the let it's x who owns the value. Rust forbids to use a thereafter.
In fact, if you do println!("a: {:?}", a); after the letthe Rust compiler says:
error: use of moved value: `a`
println!("a: {:?}", a);
^
Complete example:
#[derive(Debug)]
struct Example { member: i32 }
fn main() {
let a = Example { member: 42 }; // A struct is moved
let x = a;
println!("a: {:?}", a);
println!("x: {:?}", x);
}
And what does this move mean?
It seems that the concept comes from C++11. A document about C++ move semantics says:
From a client code point of view, choosing move instead of copy means that you don't care what happens to the state of the source.
Aha. C++11 does not care what happens with source. So in this vein, Rust is free to decide to forbid to use the source after a move.
And how it is implemented?
I don't know. But I can imagine that Rust does literally nothing. x is just a different name for the same value. Names usually are compiled away (except of course debugging symbols). So it's the same machine code whether the binding has the name a or x.
It seems C++ does the same in copy constructor elision.
Doing nothing is the most efficient possible.
Passing a value to function, also results in transfer of ownership; it is very similar to other examples:
struct Example { member: i32 }
fn take(ex: Example) {
// 2) Now ex is pointing to the data a was pointing to in main
println!("a.member: {}", ex.member)
// 3) When ex goes of of scope so as the access to the data it
// was pointing to. So Rust frees that memory.
}
fn main() {
let a = Example { member: 42 };
take(a); // 1) The ownership is transfered to the function take
// 4) We can no longer use a to access the data it pointed to
println!("a.member: {}", a.member);
}
Hence the expected error:
post_test_7.rs:12:30: 12:38 error: use of moved value: `a.member`
let s1:String= String::from("hello");
let s2:String= s1;
To ensure memory safety, rust invalidates s1, so instead of being shallow copy, this called a Move
fn main() {
// Each value in rust has a variable that is called its owner
// There can only be one owner at a time.
let s=String::from('hello')
take_ownership(s)
println!("{}",s)
// Error: borrow of moved value "s". value borrowed here after move. so s cannot be borrowed after a move
// when we pass a parameter into a function it is the same as if we were to assign s to another variable. Passing 's' moves s into the 'my_string' variable then `println!("{}",my_string)` executed, "my_string" printed out. After this scope is done, some_string gets dropped.
let x:i32 = 2;
makes_copy(x)
// instead of being moved, integers are copied. we can still use "x" after the function
//Primitives types are Copy and they are stored in stack because there size is known at compile time.
println("{}",x)
}
fn take_ownership(my_string:String){
println!('{}',my_string);
}
fn makes_copy(some_integer:i32){
println!("{}", some_integer)
}

Resources