allocating data structures while making the borrow checker happy - rust

I'm writing my first rust program and as expected I'm having problems making the borrow checker happy. Here is what I'm trying to do:
I would like to have a function that allocates some array, stores the array in some global data structure, and returns a reference to it. Example:
static mut global_data = ...
fn f() -> &str {
let s = String::new();
global.my_string = s;
return &s;
};
Is there any way to make something like this work? If not, what is "the rust way"(tm) to get an array and a pointer into it?
Alternatively, is there any documentation I could read? The rust book is unfortunately very superficial on most topics.

There are a couple things wrong with your code:
1. Using global state is very unidiomatic in Rust. It can be done in some specific scenarios, but it should never be a go-to method. You could try wrapping your state in Rc or Arc and sharing it that way in your program. If you also want to mutate this state (as you show in your example), you must also wrap it in some kind of interior mutability type. So try Rc<RefCell<State>> if you want to use the state in only one thread, or Arc<Mutex<State>> if you want to use it from multiple threads.
2. Accessing mutable static memory is unsafe. So even the following code won't compile:
static mut x: i32 = 0;
// neither of these lines works!
println!("{}", x);
x = 42;
You must use unsafe to access or modify any static mutable variable, because in doing so you take over from the compiler the responsibility of guaranteeing that no data races (from accessing this data from different threads) will occur.
3. I can't be sure, since you didn't show the type of global_data, but I assume that my_string is a field of type String. When you write
let s = String::new();
global.my_string = s;
you move ownership of that string to the global. You therefore cannot return (or even create) a reference to s. You must do this through its new owner. &global.my_string could work, but not if you do what I wrote in point 1. You could try to return a RefMut or MutexGuard, but that is probably not what you want.
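To make the Arc<Mutex<State>> suggestion concrete, here is a minimal sketch. The State struct and the store function are hypothetical names invented for illustration, since the question does not show the real types. Note that it returns a copy of the string rather than a reference, because the data is only reachable while the lock is held.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical state type; the question does not show the real one.
struct State {
    my_string: String,
}

// Stores a new string in the shared state and returns a copy of it.
// A plain `&str` cannot be returned here: the data is only reachable
// through the mutex guard, which is dropped when the function returns.
fn store(state: &Arc<Mutex<State>>, s: &str) -> String {
    let mut guard = state.lock().unwrap();
    guard.my_string = s.to_string();
    guard.my_string.clone()
}

fn main() {
    let state = Arc::new(Mutex::new(State { my_string: String::new() }));

    // A second handle to the same state, moved into another thread.
    let worker_state = Arc::clone(&state);
    let handle = thread::spawn(move || store(&worker_state, "from worker"));
    let returned = handle.join().unwrap();

    println!("returned: {}", returned);
    println!("stored:   {}", state.lock().unwrap().my_string);
}
```

For single-threaded code, the same shape should work with Rc<RefCell<State>>, using borrow_mut() in place of lock().unwrap().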

Okay, just in case someone else is having the same question, the following code seems to work:
struct foo {
b : Option<Box<u32>>,
}
static mut global : foo = foo { b : None };
fn f<'a>() -> &'a u32 {
let b : Box<u32> = Box::new(5);
unsafe {
global.b = Some(b);
match &global.b {
None => panic!(""),
Some(a) => return &a,
}
}
}
At least it compiles. Hopefully it will also do the right thing when run.
I'm aware that this is not how you are supposed to do things in rust. But I'm currently trying to figure out how to implement various data structures from scratch, and the above is just a reduced example of one of the problems I encountered.
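For comparison, a similar "allocate once, store globally, hand out a reference" pattern can probably be written without unsafe by using std::sync::OnceLock from the standard library. This is only a sketch under the assumption that the value is set once and never replaced; the name GLOBAL is made up for the example.

```rust
use std::sync::OnceLock;

// Assumption: the global is initialised once and never overwritten.
static GLOBAL: OnceLock<Box<u32>> = OnceLock::new();

// Allocates the value on first use, stores it in the global, and
// returns a reference that is valid for the rest of the program.
fn f() -> &'static u32 {
    GLOBAL.get_or_init(|| Box::new(5))
}

fn main() {
    let a = f();
    let b = f();
    // Both calls hand out a reference to the same allocation.
    println!("{} {} same allocation: {}", a, b, std::ptr::eq(a, b));
}
```

Unlike the static mut version, this cannot replace the stored value later, which is what makes it sound to hand out a 'static reference.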


What is the best way to resolve mutable borrow after immutable borrow, IF there is no perceived reference conflict

This question popped into my head (while I wasn't programming), and it actually made me question a lot of things about programming (like in C++, C#, Rust, in particular).
I want to point out, I'm aware there is a similar question on this issue:
Cannot borrow as mutable because it is also borrowed as immutable.
But I believe this question is aiming at a particular situation; a sub-problem. And I want to better understand how to resolve a thing like this in Rust.
The "thing" that I realised recently was this: "If I have a pointer/reference to an element in a dynamic array, and then I add an element, causing the array to expand and reallocate, that would break the pointer. Therefore, I need a special reference that will always point to the same element even if the array re-allocates".
This made me start thinking differently about a lot of things. But outside of that, I am aware that this problem is trivial to experienced C++ programmers. I have simply not come across this situation in my experience, unfortunately.
So I wanted to see if Rust either had an existing 'special type' for this kind of issue and, if not, what would happen if I made my own (for testing). The idea is that this "special pointer" would simply be a pointer to the Vector (List) itself, plus an integer field for the index (a usize in the code), so it's all bundled under one variable that can be 'dereferenced' whenever you need.
Note: "VecPtr" is meant to be an immutable reference.
struct VecPtr<'a, T> {
vec: &'a Vec<T>,
index: usize
}
impl<T: Copy> VecPtr<'_, T> {
pub fn value(&self) -> T {
return self.vec[self.index];
}
}
fn main() {
let mut v = Vec::<i32>::with_capacity(6);
v.push(3);
v.push(1);
v.push(4);
v.push(1);
let r = VecPtr {vec: &v,index: 2};
let n = r.value();
println!("{}",n);
v.push(5); // error!
v.push(9); // error!
v.push(6); // re-allocation triggered // also error!
let n2 = r.value();
println!("{}",n2);
return;
}
So the above example code shows that you can't have an existing immutable reference while also trying to have a mutable reference at the same time. Good!
From what I've read in the other StackOverflow question, one of the reasons for the compiler error is that the Vector could re-allocate its internal array at any time when "push" is called, which would invalidate all references to the internal array.
That makes 100% sense. So as a programmer, you may still want references into the array, just designed to be a bit safer. Instead of a direct pointer into the internal array, you hold a pointer to the vector itself and include an index so you know which element you are looking at. That means the dangling pointer issue that would occur at v.push(6); shouldn't happen any more. Yet the compiler still complains about the same issue, which I understand.
I suppose it's still concerned about the reference to the vector itself, not the internals. Which makes things a bit confusing. Because there are different pointers here that the compiler is looking at and trying to protect. But to be honest, in the example code, the pointer to vec itself looks totally fine. That reference doesn't change at all (and it shouldn't, from what I can tell).
So my question is, is there a practice at which you can tell the compiler your intentions with certain references? So the compiler knows there isn't an issue (other than the unsafe keyword).
Or alternatively, is there a better way to do what I'm trying to do in the example code?
After some more research
It looks like one solution here would be to use reference counting Rc<T>, but I'm not sure that's 100% it.
I would normally not ask this question due to there being a similar existing question, but this one (I think) is investigating a slightly different situation, where someone (or me) would try to resolve an unsafe reference situation, but the compiler still insists there is an issue.
I guess the question comes down to this: would you find this acceptable?
fn main() {
let mut v = Vec::<i32>::with_capacity(6);
v.push(3);
v.push(1);
v.push(4);
v.push(1);
let r = VecPtr { vec: &v, index: 2 };
let n = r.value();
println!("{}",n);
v[2] = -1;
let n2 = r.value(); // This returned 4 just three lines ago and I was
// promised it wouldn't change! Now it's -1.
println!("{}",n2);
}
Or this
fn main() {
let mut v = Vec::<i32>::with_capacity(6);
v.push(3);
v.push(1);
v.push(4);
v.push(1);
let r = VecPtr { vec: &v, index: 2 };
let n = r.value();
println!("{}",n);
v.clear();
let n2 = r.value(); // This exact same thing that worked three lines ago will now panic.
println!("{}",n2);
}
Or, worst of all:
fn main() {
let mut v = Vec::<i32>::with_capacity(6);
v.push(3);
v.push(1);
v.push(4);
v.push(1);
let r = VecPtr { vec: &v, index: 2 };
let n = r.value();
println!("{}",n);
drop(v);
let n2 = r.value(); // Now you do actually have a dangling pointer.
println!("{}",n2);
}
Rust's answer is an emphatic "no" and that is enforced in the type system. It's not just about the unsoundness of dereferencing dangling pointers, it's a core design decision.
Can you tell the compiler your intentions with certain references? Yes! You can tell the compiler whether you want to share your reference, or whether you want to mutate through it. In your case, you've told the compiler that you want to share it. Which means you're not allowed to mutate it anymore. And as the examples above show, for good reason.
For the sake of this, the borrow checker has no notion of the stack or the heap, it doesn't know what types allocate and which don't, or when a Vec resizes. It only knows and cares about moving values and borrowing references: whether they're shared or mutable and for how long they live.
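One consequence of this is that a handle which stores only the index, and no reference at all, holds no borrow and so never conflicts with later mutation. The sketch below is an alternative that is not part of the original question: VecIndex is a hypothetical type, and the vector has to be passed in at every lookup. The cases from the examples above (changed value, cleared vector) then become visible results instead of compile errors.

```rust
// Hypothetical handle type: stores only a position, no reference.
#[derive(Clone, Copy)]
struct VecIndex {
    index: usize,
}

impl VecIndex {
    // The vector is borrowed only for the duration of this call.
    // Returns None if the element no longer exists.
    fn value<T: Copy>(&self, vec: &[T]) -> Option<T> {
        vec.get(self.index).copied()
    }
}

fn main() {
    let mut v = vec![3, 1, 4, 1];
    let r = VecIndex { index: 2 };
    println!("{:?}", r.value(&v));

    // No borrow is outstanding, so pushing (and reallocating) is fine.
    v.push(5);
    v.push(9);
    v.push(6);
    println!("{:?}", r.value(&v));

    // The handle can go stale, but that shows up as None, not as a dangling pointer.
    v.clear();
    println!("{:?}", r.value(&v));
}
```

The trade-off is that nothing ties the handle to a particular vector, so the caller has to keep track of which vector an index belongs to.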
Now, if you want to make your structure work, Rust offers you some possibilities: One of those is RefCell. A RefCell allows you to borrow a mutable reference from an immutable one at the expense of runtime checking that nothing is aliased incorrectly. This together with an Rc can make your VecPtr:
use std::cell::RefCell;
use std::rc::Rc;
struct VecPtr<T> {
vec: Rc<RefCell<Vec<T>>>,
index: usize,
}
impl<T: Copy> VecPtr<T> {
pub fn value(&self) -> T {
return self.vec.borrow()[self.index];
}
}
fn main() {
let v = Rc::new(RefCell::new(Vec::<i32>::with_capacity(6)));
{
let mut v = v.borrow_mut();
v.push(3);
v.push(1);
v.push(4);
v.push(1);
}
let r = VecPtr {
vec: Rc::clone(&v),
index: 2,
};
let n = r.value();
println!("{}", n);
{
let mut v = v.borrow_mut();
v.push(5);
v.push(9);
v.push(6);
}
let n2 = r.value();
println!("{}", n2);
}
I'll leave it to you to look into how RefCell works.
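As a starting point, here is a small sketch of the runtime check itself. The helper can_mutate is a made-up name; it uses try_borrow_mut, which reports a conflicting borrow as an Err where borrow_mut would panic.

```rust
use std::cell::RefCell;

// Hypothetical helper: reports whether a mutable borrow is possible right now.
fn can_mutate(cell: &RefCell<Vec<i32>>) -> bool {
    cell.try_borrow_mut().is_ok()
}

fn main() {
    let cell = RefCell::new(vec![3, 1, 4, 1]);

    {
        // A shared borrow is active inside this block.
        let reader = cell.borrow();
        // The conflict is detected at runtime rather than at compile time.
        println!("can mutate while reading: {}", can_mutate(&cell));
        println!("third element: {}", reader[2]);
    } // `reader` is dropped here, which releases the shared borrow.

    // With no outstanding borrows, mutation is allowed again.
    println!("can mutate afterwards: {}", can_mutate(&cell));
    cell.borrow_mut().push(5);
    println!("length after push: {}", cell.borrow().len());
}
```

This is why the blocks around borrow_mut() in the code above matter: each mutable borrow has to end before value() borrows again.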

Understanding the effects of shared references on a nested data structure

Ownership Tree
Hi,
I was trying to understand ownership concepts in Rust and came across this image (attached in this post) in the "Programming Rust" book.
In particular, I am concerned about the "Borrowing a shared reference" part. In the book, the author says
Values borrowed by shared references are read-only. Across the
lifetime of a shared reference, neither its referent, nor anything
reachable from that referent, can be changed by anything. There exist
no live mutable references to anything in that structure, its owner is
held read-only, and so on. It’s really frozen
In the image, he goes on to highlight the path along the ownership tree that becomes immutable once a shared reference is taken to a particular section of the ownership tree. But what confused me is that the author also mentions that certain other parts of the ownership tree are not read only.
So I tried to test out with this code:
fn main(){
let mut v = Vec::new();
v.push(Vec::new());
v[0].push(vec!["alpha".to_string()]);
v[0].push(vec!["beta".to_string(), "gamma".to_string()]);
let r2 = &(v[0][1]); //Taking a shared reference here
v[0][0].push("pi".to_string());
println!("{:?}", r2)
}
I understand that v[0][0] cannot be mutated because v itself is held by an immutable shared reference (as a consequence of the shared reference to v[0][1]), and the Rust compiler helpfully points this out. My question is: when the author marks certain parts of the ownership tree as "not read only", how can we access these parts to change them?
If my code snippet is not a correct example for what the author intended to convey, kindly help me with an example that demonstrates what the author is trying to imply here. Thanks.
There are particular cases where you can split borrows, creating simultaneously existing references that can be any mix of mutable and immutable as long as they don't overlap. These are:
Anything where the compiler can statically track the lack of overlap: that is, fields in a struct, tuple, or enum.
Specifically written unsafe code which provides this feature, such as mutable-reference iterators over collections.
Your code as written does not compile because the compiler does not attempt to understand what indexing a Vec does, so it does not possess and cannot use the fact that v[0][0] does not overlap v[0][1].
Here is a program that works, using a direct translation of the tree shown in the figure:
#[derive(Debug)]
struct Things {
label: &'static str,
a: Option<Box<Things>>,
b: Option<Box<Things>>,
c: Option<Box<Things>>,
}
fn main() {
// Construct depicted structure
let mut root = Box::new(Things {
label: "root",
a: None,
b: None,
c: Some(Box::new(Things {
label: "root.c",
a: None,
b: None,
c: None,
})),
});
// "Borrowing a shared reference"
// .as_ref().unwrap() gets `&Box<Things>` out of `Option<Box<Things>>`
// (there are several other ways this could be done)
let shared_reference = &root.c.as_ref().unwrap();
let mutable_reference = &mut root.a;
// Now, root and root.a are in the "inaccessible" state because they are
// borrowed. (We could still create an &root.b reference).
// Mutate while the shared reference must still exist
dbg!(shared_reference);
*mutable_reference = Some(Box::new(Things {
label: "new",
a: None,
b: None,
c: None,
}));
dbg!(shared_reference);
// Now the references are not used any more, so we can access the root.
// Let's look at the change we made.
dbg!(root);
}
This program is accepted by the compiler because it understands that struct fields do not overlap, so the root may be split.
It is possible to split borrows of vectors — just not with the indexing operator. You can do it with pattern matching, mutable iteration, or with .split_at_mut(). Here's that last option, which is the most “random access” capable one:
fn main() {
let mut v = Vec::new();
v.push(Vec::new());
v[0].push(vec!["alpha".to_string()]);
v[0].push(vec!["beta".to_string(), "gamma".to_string()]);
let (half1, half2): (&mut [Vec<String>], &mut [Vec<String>]) =
v[0].split_at_mut(1);
let r1 = &mut half1[0];
let r2 = &half2[0];
r1.push("pi".to_string());
println!("{:?}", r2);
}
This program works because split_at_mut() contains unsafe code that specifically creates two non-overlapping slices. This is one of the fundamental tools of Rust: using unsafe inside of libraries to create sound abstractions that wouldn't be possible using just the concepts the compiler understands.
With a pattern match instead, it would be:
if let [r1, r2] = &mut *v[0] {
r1.push("pi".to_string());
println!("{:?}", r2);
} else {
// Pattern failed because the length did not match
panic!("oops, v was not two elements long");
}
This compiles because the compiler understands that pattern-matching a slice (or a struct, or anything else matchable) creates non-overlapping references to each element. (Pattern matching is implemented by the compiler and never runs Rust code to make decisions about the structure being matched.)
(This version has an explicit failure branch; the previous version would panic on the split_at_mut() or on half2[0] if v[0] was too short.)
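The remaining option mentioned above, mutable iteration, could look roughly like this. It is a sketch only; first_two is a hypothetical helper, and it relies on iter_mut() handing out one non-overlapping &mut per element.

```rust
// Hypothetical helper: returns disjoint mutable references to the first
// two elements, or None if the slice has fewer than two elements.
fn first_two<T>(items: &mut [T]) -> Option<(&mut T, &mut T)> {
    let mut it = items.iter_mut();
    let a = it.next()?;
    let b = it.next()?;
    Some((a, b))
}

fn main() {
    let mut v = vec![
        vec!["alpha".to_string()],
        vec!["beta".to_string(), "gamma".to_string()],
    ];

    if let Some((r1, r2)) = first_two(&mut v) {
        // r1 and r2 are both live here and refer to different elements.
        r1.push("pi".to_string());
        println!("{:?} {:?}", r1, r2);
    }
}
```

Iteration only visits elements in order, so this is less convenient than split_at_mut() when the two elements are far apart.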
Someone should probably check my answer, as I am fairly new to Rust myself.
But...
I think this is because a Vec doesn't give the compiler the same guarantees as, say, a tuple or nested structs: the compiler can see that two tuple fields are disjoint, but not that two indices are.
Here's a tuple version of the example you gave (Although tuples don't support pushing, so I'm just incrementing an integer):
fn main() {
let mut v = (((1, 3), (5)));
let r2 = &v.0.1; //Taking a shared reference here
let v2 = &mut v.0.0;
*v2 += 1;
println!("{:?}", r2);
}
The above compiles. But if you attempt to borrow: let r2 = &v.0.0;, you'll get the same error as before.
Now, if you want to actually use nested vectors for trees, there are some crates to help with that which do not incur runtime costs, namely token_cell (or its inspiration, ghost_cell):
https://docs.rs/token-cell/1.1.0/token_cell/index.html
https://docs.rs/ghost-cell/latest/ghost_cell/
Here's the example with a token_cell wrapping the vec tree structure:
use token_cell::*;
generate_static_token!(Token);
fn main() {
let mut token = Token::new();
let token2 = Token::new();
let v = TokenCell::new(vec![vec![
vec!["beta".to_string()],
vec!["gamma".to_string()],
]]);
let r2 = &v.borrow(&token2)[0][1]; //Taking a shared reference here
v.borrow_mut(&mut token)[0][0].push("pi".to_string());
println!("{:?}", r2)
}
I hope this clears some confusion up at least.

How to return the contents of an Rc?

I am trying to return a moved value from an Rc:
if let Some(last_elem) = self.tail.take() {
let last = Rc::clone(&last_elem);
let tmp_node = last.borrow();
let tmp = tmp_node.deref();
return Some(*tmp);
}
Where:
self.tail has type Option<Rc<RefCell<Node<T>>>>;
after borrow the tmp_node has type Ref<Node<T>>; and
I would like to return Option<Node<T>>.
However the compiler complains, "cannot move out of *tmp which is behind a shared reference".
How can I fix this?
In general, it's impossible to move a value out of an Rc, since other Rc handles may still point to the same value and read it later.
However, if your code's logic can guarantee that this Rc is the sole owner of the underlying data, there's an escape hatch - Rc::try_unwrap, which performs the check at runtime and fails if the condition is not fulfilled. After that, we can easily unwrap the RefCell (not Ref!) with RefCell::into_inner:
pub fn unwrap<T>(last_elem: Rc<RefCell<T>>) -> T {
let inner: RefCell<T> = Rc::try_unwrap(last_elem)
.unwrap_or_else(|_| panic!("The last_elem was shared, failed to unwrap"));
inner.into_inner()
}
Another possible approach, if you want to get a copy rather than move the value out of the Rc, would be to go with your original approach but use clone instead of deref:
pub fn clone_out<T: Clone>(last_elem: Rc<RefCell<T>>) -> T {
last_elem.borrow().clone()
}
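To see the runtime check of Rc::try_unwrap in action without panicking, the result can be passed back to the caller instead. This is a sketch; take_inner is a hypothetical name, and it returns the still-shared Rc in the error case, just as try_unwrap does.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Returns the inner value if `rc` is the only strong reference,
// otherwise hands the still-shared `Rc` back to the caller.
fn take_inner<T>(rc: Rc<RefCell<T>>) -> Result<T, Rc<RefCell<T>>> {
    Rc::try_unwrap(rc).map(RefCell::into_inner)
}

fn main() {
    // Sole owner: the value can be moved out.
    let sole = Rc::new(RefCell::new(String::from("only owner")));
    println!("sole owner unwrapped: {}", take_inner(sole).is_ok());

    // A second handle exists, so moving the value out is refused.
    let shared = Rc::new(RefCell::new(1));
    let _other = Rc::clone(&shared);
    println!("shared unwrapped: {}", take_inner(shared).is_ok());
}
```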
A side note: it looks like you're trying to implement some kind of linked list. This is a notoriously hard problem in Rust, since it plays very badly with single-ownership semantics. But if you're really sure you want to go through all the dirty details, this book is highly recommended.

Why do we need Rc<T> when immutable references can do the job?

To illustrate the necessity of Rc<T>, the Book presents the following snippet (spoiler: it won't compile) to show that we cannot enable multiple ownership without Rc<T>.
enum List {
Cons(i32, Box<List>),
Nil,
}
use crate::List::{Cons, Nil};
fn main() {
let a = Cons(5, Box::new(Cons(10, Box::new(Nil))));
let b = Cons(3, Box::new(a));
let c = Cons(4, Box::new(a));
}
It then claims (emphasis mine)
We could change the definition of Cons to hold references instead, but then we would have to specify lifetime parameters. By specifying lifetime parameters, we would be specifying that every element in the list will live at least as long as the entire list. The borrow checker wouldn’t let us compile let a = Cons(10, &Nil); for example, because the temporary Nil value would be dropped before a could take a reference to it.
Well, not quite. The following snippet compiles under rustc 1.52.1
enum List<'a> {
Cons(i32, &'a List<'a>),
Nil,
}
use crate::List::{Cons, Nil};
fn main() {
let a = Cons(5, &Cons(10, &Nil));
let b = Cons(3, &a);
let c = Cons(4, &a);
}
Note that by taking a reference, we no longer need a Box<T> indirection to hold the nested List. Furthermore, I can point both b and c to a, which gives a multiple conceptual owners (which are actually borrowers).
Question: why do we need Rc<T> when immutable references can do the job?
With "ordinary" borrows you can very roughly think of a statically proven order-by-relationship, where the compiler needs to prove that the owner of something always comes to life before any borrows and always dies after all borrows died (a owns String, it comes to life before b which borrows a, then b dies, then a dies; valid). For a lot of use-cases, this can be done, which is Rust's insight to make the borrow-system practical.
There are cases where this can't be done statically. In the example you've given, you're sort of cheating, because the nested borrows (&Nil and &Cons(10, &Nil)) are borrows of constant expressions, which the compiler can give a 'static lifetime; and 'static items can be "ordered" before or after anything out to infinity because of that - so there actually is no constraint on them in the first place. The example becomes much more complex when you take different lifetimes (many List<'a>, List<'b>, etc.) into account. This issue will become apparent when you try to pass values into functions and those functions try to add items. This is because values created inside functions will die after leaving their scope (i.e. when the enclosing function returns), so we cannot keep a reference to them afterwards, or there will be dangling references.
Rc comes in when one can't prove statically who is the original owner, whose lifetime starts before any other and ends after any other(!). A classic example is a graph structure derived from user input, where multiple nodes can refer to one other node. They need to form a "born after, dies before" relationship with the node they are referencing at runtime, to guarantee that they never reference invalid data. The Rc is a very simple solution to that because a simple counter can represent these relationships. As long as the counter is not zero, some "born after, dies before" relationship is still active. The key insight here is that it does not matter in which order the nodes are created and die because any order is valid. Only the points on either end - where the counter gets to 0 - are actually important, any increase or decrease in between is the same (0=+1+1+1-1-1-1=0 is the same as 0=+1+1-1+1-1-1=0) The Rc is destroyed when the counter reaches zero. In the graph example this is when a node is not being referred to any longer. This tells the owner of that Rc (the last node referring) "Oh, it turns out I am the owner of the underlying node - nobody knew! - and I get to destroy it".
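A small sketch of the function case described above, using an Rc-based variant of the list from the question. The helpers prepend and sum are hypothetical names added for illustration. With &'a List, prepend could not return a node that refers to a value it created locally; with Rc the node simply stays alive until its count drops to zero.

```rust
use std::rc::Rc;

enum List {
    Cons(i32, Rc<List>),
    Nil,
}

// Builds a new node inside a function and returns it. The node lives on
// the heap and is freed when the last Rc pointing at it is dropped.
fn prepend(head: i32, tail: &Rc<List>) -> Rc<List> {
    Rc::new(List::Cons(head, Rc::clone(tail)))
}

// Adds up all values in the list.
fn sum(list: &List) -> i32 {
    match list {
        List::Cons(value, rest) => value + sum(rest),
        List::Nil => 0,
    }
}

fn main() {
    let a = prepend(5, &prepend(10, &Rc::new(List::Nil)));
    let b = prepend(3, &a);
    let c = prepend(4, &a);
    println!("handles to a: {}", Rc::strong_count(&a));

    // Dropping the original handle does not free the shared tail,
    // because b and c still hold counted references to it.
    drop(a);
    println!("{} {}", sum(&b), sum(&c));
}
```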
Even single-threaded, there are still times the destruction order is determined dynamically, whereas for the borrow checker to work, there must be a determined lifetime tree (stack).
use std::rc::Rc;
fn run() {
let writer = Rc::new(std::io::sink());
let mut counters = vec![
(7, Rc::clone(&writer)),
(7, writer),
];
while !counters.is_empty() {
let idx = read_counter_index();
counters[idx].0 -= 1;
if counters[idx].0 == 0 {
counters.remove(idx);
}
}
}
fn read_counter_index() -> usize {
unimplemented!()
}
As you can see in this example, the order of destruction is determined by user input.
Another reason to use smart pointers is simplicity. The borrow checker does incur some code complexity. For example, using a smart pointer, you are able to maneuver around the self-referential struct problem with a tiny overhead.
struct SelfRefButDynamic {
a: Rc<u32>,
b: Rc<u32>,
}
impl SelfRefButDynamic {
pub fn new() -> Self {
let a = Rc::new(0);
let b = Rc::clone(&a);
Self { a, b }
}
}
This is not possible with static (compile-time) references:
struct WontDo {
a: u32,
b: &u32,
}

What happens to the stack when a value is moved in Rust? [duplicate]

In Rust, there are two possibilities to take a reference
Borrow, i.e., take a reference but don't allow mutating the reference destination. The & operator borrows ownership from a value.
Borrow mutably, i.e., take a reference to mutate the destination. The &mut operator mutably borrows ownership from a value.
The Rust documentation about borrowing rules says:
First, any borrow must last for a scope no greater than that of the
owner. Second, you may have one or the other of these two kinds of
borrows, but not both at the same time:
one or more references (&T) to a resource,
exactly one mutable reference (&mut T).
I believe that taking a reference is creating a pointer to the value and accessing the value by the pointer. This could be optimized away by the compiler if there is a simpler equivalent implementation.
However, I don't understand what move means and how it is implemented.
For types implementing the Copy trait it means copying e.g. by assigning the struct member-wise from the source, or a memcpy(). For small structs or for primitives this copy is efficient.
And for move?
This question is not a duplicate of What are move semantics? because Rust and C++ are different languages and move semantics are different between the two.
Semantics
Rust implements what is known as an Affine Type System:
Affine types are a version of linear types imposing weaker constraints, corresponding to affine logic. An affine resource can only be used once, while a linear one must be used once.
Types that are not Copy, and are thus moved, are Affine Types: you may use them either once or never, nothing else.
Rust qualifies this as a transfer of ownership in its Ownership-centric view of the world (*).
(*) Some of the people working on Rust are much more qualified than I am in CS, and they knowingly implemented an Affine Type System; however contrary to Haskell which exposes the math-y/cs-y concepts, Rust tends to expose more pragmatic concepts.
Note: it could be argued that Affine Types returned from a function tagged with #[must_use] are actually Linear Types from my reading.
Implementation
It depends. Please keep in mind that Rust is a language built for speed, and there are numerous optimization passes at play here, which will depend on the compiler used (rustc + LLVM, in our case).
Within a function body (playground):
fn main() {
let s = "Hello, World!".to_string();
let t = s;
println!("{}", t);
}
If you check the LLVM IR (in Debug), you'll see:
%_5 = alloca %"alloc::string::String", align 8
%t = alloca %"alloc::string::String", align 8
%s = alloca %"alloc::string::String", align 8
%0 = bitcast %"alloc::string::String"* %s to i8*
%1 = bitcast %"alloc::string::String"* %_5 to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* %1, i8* %0, i64 24, i32 8, i1 false)
%2 = bitcast %"alloc::string::String"* %_5 to i8*
%3 = bitcast %"alloc::string::String"* %t to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* %3, i8* %2, i64 24, i32 8, i1 false)
Underneath the covers, rustc invokes a memcpy from the result of "Hello, World!".to_string() to s and then to t. While it might seem inefficient, checking the same IR in Release mode you will realize that LLVM has completely elided the copies (realizing that s was unused).
The same situation occurs when calling a function: in theory you "move" the object into the function stack frame, however in practice if the object is large the rustc compiler might switch to passing a pointer instead.
Another situation is returning from a function, but even then the compiler might apply "return value optimization" and build directly in the caller's stack frame -- that is, the caller passes a pointer into which to write the return value, which is used without intermediary storage.
The ownership/borrowing constraints of Rust enable optimizations that are difficult to reach in C++ (which also has RVO but cannot apply it in as many cases).
So, the digest version:
moving large objects is inefficient, but there are a number of optimizations at play that might elide the move altogether
moving involves a memcpy of std::mem::size_of::<T>() bytes, so moving a large String is efficient because it only copies a few bytes (its pointer, length, and capacity), whatever the size of the allocated buffer it holds onto
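The second point can be checked with a short sketch: the heap buffer of a String stays where it is across a move, and only the small handle is copied. The helper buffer_survives_move is a made-up name, and the printed size depends on the platform (the 24 in the IR above is what a 64-bit target gives).

```rust
use std::mem::size_of;

// Hypothetical helper: moves the String into a new binding and reports
// whether the heap buffer is still at the same address afterwards.
fn buffer_survives_move(s: String) -> bool {
    let before = s.as_ptr();
    let t = s; // move: only the handle is copied
    before == t.as_ptr()
}

fn main() {
    // A String with a large heap buffer.
    let big = "x".repeat(1_000_000);
    println!("heap length: {} bytes", big.len());
    println!("size_of::<String>(): {} bytes", size_of::<String>());
    println!("buffer unchanged by move: {}", buffer_survives_move(big));
}
```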
When you move an item, you are transferring ownership of that item. That's a key component of Rust.
Let's say I had a struct, and then I assign the struct from one variable to another. By default, this will be a move, and I've transferred ownership. The compiler will track this change of ownership and prevent me from using the old variable any more:
pub struct Foo {
value: u8,
}
fn main() {
let foo = Foo { value: 42 };
let bar = foo;
println!("{}", foo.value); // error: use of moved value: `foo.value`
println!("{}", bar.value);
}
how it is implemented.
Conceptually, moving something doesn't need to do anything. In the example above, there wouldn't be a reason to actually allocate space somewhere and then move the allocated data when I assign to a different variable. I don't actually know what the compiler does, and it probably changes based on the level of optimization.
For practical purposes though, you can think that when you move something, the bits representing that item are duplicated as if via memcpy. This helps explain what happens when you pass a variable to a function that consumes it, or when you return a value from a function (again, the optimizer can do other things to make it efficient, this is just conceptually):
// Ownership is transferred from the caller to the callee
fn do_something_with_foo(foo: Foo) {}
// Ownership is transferred from the callee to the caller
fn make_a_foo() -> Foo { Foo { value: 42 } }
"But wait!", you say, "memcpy only comes into play with types implementing Copy!". This is mostly true, but the big difference is that when a type implements Copy, both the source and the destination are valid to use after the copy!
One way of thinking of move semantics is the same as copy semantics, but with the added restriction that the thing being moved from is no longer a valid item to use.
However, it's often easier to think of it the other way: The most basic thing that you can do is to move / give ownership away, and the ability to copy something is an additional privilege. That's the way that Rust models it.
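Seen side by side, the difference is only whether the source stays usable. In this sketch the two struct names are invented for the example; the commented-out line is the one the compiler would reject.

```rust
// Not Copy: assigning it moves ownership.
#[derive(Debug)]
struct Moved {
    value: u8,
}

// Copy: assigning it duplicates the bits and leaves the source usable.
#[derive(Debug, Clone, Copy)]
struct Copied {
    value: u8,
}

fn main() {
    let a = Moved { value: 42 };
    let b = a;
    // println!("{:?}", a); // error[E0382]: borrow of moved value: `a`
    println!("{:?}", b.value);

    let c = Copied { value: 42 };
    let d = c;
    // Both bindings are valid after the copy.
    println!("{:?} {:?}", c, d);
}
```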
This is a tough question for me! After using Rust for a while the move semantics are natural. Let me know what parts I've left out or explained poorly.
Rust's move keyword always bothered me, so I decided to write down my understanding, which I reached after discussion with my colleagues.
I hope this might help someone.
let x = 1;
In the above statement, x is a variable whose value is 1. Now,
let y = || println!("y is a variable whose value is a closure");
So, the move keyword is used to transfer ownership of a captured variable to the closure.
In the example below, without move, x is not owned by the closure: y only borrows x, so x remains available for further use.
let x = 1;
let y = || println!("this is a closure that prints x = {}", x);
On the other hand, in the next case, x is owned by the closure. For a type that is not Copy (a String, say), x would no longer be available for further use; an integer like this one is Copy, so the closure receives its own copy and x stays usable.
let x = 1;
let y = move || println!("this is a closure that prints x = {}", x);
By owning I mean containing as a member variable. The two cases above correspond to the following two cases, which can also be read as a rough picture of how the Rust compiler expands them.
The former (without move; i.e. no transfer of ownership),
struct ClosureObject<'a> {
x: &'a u32
}
let x = 1;
let y = ClosureObject {
x: &x
};
The latter (with move; i.e. transfer of ownership),
struct ClosureObject {
x: u32
}
let x = 1;
let y = ClosureObject {
x: x
};
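The effect of move is easier to observe with a type that is not Copy. The sketch below uses a String; make_greeter is a hypothetical helper showing a case where move is needed, because the closure outlives the function that created it.

```rust
// Hypothetical helper: the returned closure must own `name`, since the
// parameter would otherwise be dropped when this function returns.
fn make_greeter(name: String) -> impl Fn() -> String {
    move || format!("hello, {}", name)
}

fn main() {
    let name = String::from("x");

    // Without `move`, the closure only borrows `name`.
    let by_ref = || println!("borrowed: {}", name);
    by_ref();
    println!("still usable: {}", name);

    // With `move`, the closure takes ownership of `name`.
    let by_move = move || println!("owned: {}", name);
    by_move();
    // println!("{}", name); // error[E0382]: borrow of moved value: `name`

    // For a Copy type such as an integer, `move` copies the value into
    // the closure, so the original binding stays usable.
    let n = 1;
    let show = move || println!("n = {}", n);
    show();
    println!("n is still {}", n);

    println!("{}", make_greeter(String::from("closure"))());
}
```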
Please let me answer my own question. I had trouble, but by asking a question here I did Rubber Duck Problem Solving. Now I understand:
A move is a transfer of ownership of the value.
For example, the assignment let x = a; transfers ownership: at first a owned the value. After the let, it's x that owns the value. Rust forbids using a thereafter.
In fact, if you do println!("a: {:?}", a); after the let, the Rust compiler says:
error: use of moved value: `a`
println!("a: {:?}", a);
^
Complete example:
#[derive(Debug)]
struct Example { member: i32 }
fn main() {
let a = Example { member: 42 }; // A struct is moved
let x = a;
println!("a: {:?}", a);
println!("x: {:?}", x);
}
And what does this move mean?
It seems that the concept comes from C++11. A document about C++ move semantics says:
From a client code point of view, choosing move instead of copy means that you don't care what happens to the state of the source.
Aha. C++11 does not care what happens with source. So in this vein, Rust is free to decide to forbid to use the source after a move.
And how is it implemented?
I don't know. But I can imagine that Rust does literally nothing. x is just a different name for the same value. Names usually are compiled away (except of course debugging symbols). So it's the same machine code whether the binding has the name a or x.
It seems C++ does the same in copy constructor elision.
Doing nothing is the most efficient possible.
Passing a value to a function also results in a transfer of ownership; it is very similar to the other examples:
struct Example { member: i32 }
fn take(ex: Example) {
// 2) Now ex is pointing to the data a was pointing to in main
println!("a.member: {}", ex.member)
// 3) When ex goes out of scope, so does the access to the data it
// was pointing to. So Rust frees that memory.
}
fn main() {
let a = Example { member: 42 };
take(a); // 1) The ownership is transferred to the function take
// 4) We can no longer use a to access the data it pointed to
println!("a.member: {}", a.member);
}
Hence the expected error:
post_test_7.rs:12:30: 12:38 error: use of moved value: `a.member`
let s1: String = String::from("hello");
let s2: String = s1;
To ensure memory safety, Rust invalidates s1 after the assignment, so instead of a shallow copy this is called a move.
fn main() {
    // Each value in Rust has a variable that is called its owner.
    // There can only be one owner at a time.
    let s = String::from("hello");
    take_ownership(s);
    // println!("{}", s);
    // Error: borrow of moved value: `s`. The value is borrowed here after
    // the move, so s cannot be used once it has been moved.
    // Passing a parameter to a function is the same as assigning s to
    // another variable: passing s moves it into my_string, then
    // println!("{}", my_string) runs and prints the string. When that
    // scope ends, my_string is dropped.
    let x: i32 = 2;
    makes_copy(x);
    // Instead of being moved, integers are copied, so we can still use x
    // after the call. Primitive types such as integers are Copy; their size
    // is known at compile time and they can live on the stack.
    println!("{}", x);
}
fn take_ownership(my_string: String) {
    println!("{}", my_string);
}
fn makes_copy(some_integer: i32) {
    println!("{}", some_integer);
}
