How does Rust respect the Copy trait? - rust

If you make a struct derive the Copy trait, then Rust is going to make y a copy of x in the code below, as opposed to moving from x to y otherwise:
#[derive(Debug, Copy, Clone)]
struct Foo;
let x = Foo;
let y = x;
If I were in C++ I'd say that Copy somehow makes Foo implement the = operator in a way that it copies the entire object on the right side.
In Rust, is it simply implemented as a rule in the compiler? When the compiler finds let y=x it simply checks if the Copy trait is derived or not and decides if copy or moves?
I'm interested in Rust internals so I can understand the language better. This information can't be found in tutorials.

Yes, this is directly implemented in the compiler.
It affects any situation that would otherwise move the item, so it also affects passing parameters to functions or matching in a match expression – basically any situation that involves pattern matching. In that way, it's not really comparable to implementing the = operator in C++.
The definition of the Copy trait is marked as a "lang" item in the source code of the standard library. The compiler knows that the item marked with #[lang = "copy"] is the trait that decides whether a type is moved or copied. The compiler also knows some rules about types that are implicitly Copy, like closures or tuples that only contain items that are Copy.
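To make the "any situation that would otherwise move" point concrete, here is a minimal sketch. The Point type and the two helper functions are made up for illustration; they show a Copy value surviving a let binding, a by-value function call, and a by-value match.

```rust
#[derive(Debug, Copy, Clone, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// Takes its argument by value; for a Copy type the caller keeps its original.
fn sum(p: Point) -> i32 {
    p.x + p.y
}

// Matches by value; `inner` is bound by copying out of the Option.
fn unwrap_or_origin(opt: Option<Point>) -> Point {
    match opt {
        Some(inner) => inner,
        None => Point { x: 0, y: 0 },
    }
}

fn main() {
    let a = Point { x: 1, y: 2 };
    let b = a; // copy, not move
    let total = sum(a); // `a` is still usable after the call
    let opt = Some(a); // Option<Point> is Copy because Point is
    let c = unwrap_or_origin(opt);
    // All of these are still valid because nothing was moved.
    println!("{:?} {:?} {:?} {} {:?}", a, b, c, total, opt);
}
```

Removing Copy from the derive list should turn each of those later uses of a and opt into a "use of moved value" error.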

In Rust, is it simply implemented as a rule in the compiler? When the compiler finds let y=x it simply checks if the Copy trait is derived or not and decides if copy or moves?
At runtime, there is no semantic difference (though the applicable optimisations might vary): both a move and a copy are just a memcpy, and in either case the copy can be optimised away.
At compile time, the compiler is indeed aware of the Copy/!Copy distinction: in the case where x would be a !Copy type, the assignment "consumes" the variable, meaning you can't use it afterwards.
If the item is Copy then it doesn't and you can.
That's about it.

If you want to dig into how this code is compiled, you could take a look at the MIR representation in the playground. In this slightly simplified version:
#[derive(Copy, Clone)]
struct Foo;
fn main() {
let x = Foo;
let y = x;
}
the slightly trimmed MIR output is:
bb0: {
StorageLive(_1);
_1 = const Scalar(<ZST>): Foo;
StorageLive(_2);
_2 = const Scalar(<ZST>): Foo;
StorageDead(_2);
StorageDead(_1);
return;
}
So in this specific case, the compiler has determined that Foo is a Zero Sized Type (ZST), so x and y (here, _1 and _2) are both assigned a constant empty value and there isn't any copying as such.
To see it in the playground, click here then select "MIR" from the drop down triple dots button just to the right of the "Run" button. For more in depth information about MIR, take a look at the rustc dev guide.

Related

Rust behavior after move [duplicate]

The Rust language website claims move semantics as one of the features of the language. But I can't see how move semantics is implemented in Rust.
Rust boxes are the only place where move semantics are used.
let x = Box::new(5);
let y: Box<i32> = x; // x is 'moved'
The above Rust code can be written in C++ as
auto x = std::make_unique<int>(5);
auto y = std::move(x); // Note the explicit move
As far as I know (correct me if I'm wrong),
Rust doesn't have constructors at all, let alone move constructors.
No support for rvalue references.
No way to create functions overloads with rvalue parameters.
How does Rust provide move semantics?
I think it's a very common issue when coming from C++. In C++ you are doing everything explicitly when it comes to copying and moving. The language was designed around copying and references. With C++11 the ability to "move" stuff was glued onto that system. Rust on the other hand took a fresh start.
Rust doesn't have constructors at all, let alone move constructors.
You do not need move constructors. Rust moves everything that "does not have a copy constructor", a.k.a. "does not implement the Copy trait".
struct A;
fn test() {
let a = A;
let b = a;
let c = a; // error, a is moved
}
Rust's default constructor is (by convention) simply an associated function called new:
struct A(i32);
impl A {
fn new() -> A {
A(5)
}
}
More complex constructors should have more expressive names. This is the named constructor idiom in C++.
No support for rvalue references.
It has always been a requested feature, see RFC issue 998, but most likely you are asking for a different feature: moving stuff to functions:
struct A;
fn move_to(a: A) {
// a is moved into here, you own it now.
}
fn test() {
let a = A;
move_to(a);
let c = a; // error, a is moved
}
No way to create functions overloads with rvalue parameters.
You can do that with traits.
trait Ref {
fn test(&self);
}
trait Move {
fn test(self);
}
struct A;
impl Ref for A {
fn test(&self) {
println!("by ref");
}
}
impl Move for A {
fn test(self) {
println!("by value");
}
}
fn main() {
let a = A;
(&a).test(); // prints "by ref"
a.test(); // prints "by value"
}
Rust's moving and copying semantics are very different from C++. I'm going to take a different approach to explain them than the existing answer.
In C++, copying is an operation that can be arbitrarily complex, due to custom copy constructors. Rust doesn't want custom semantics of simple assignment or argument passing, and so takes a different approach.
First, an assignment or argument passing in Rust is always just a simple memory copy.
let foo = bar; // copies the bytes of bar to the location of foo (might be elided)
function(foo); // copies the bytes of foo to the parameter location (might be elided)
But what if the object controls some resources? Let's say we are dealing with a simple smart pointer, Box.
let b1 = Box::new(42);
let b2 = b1;
At this point, if just the bytes are copied over, wouldn't the destructor (drop in Rust) be called for each object, thus freeing the same pointer twice and causing undefined behavior?
The answer is that Rust moves by default. This means that it copies the bytes to the new location, and the old object is then gone. It is a compile error to access b1 after the second line above. And the destructor is not called for it. The value was moved to b2, and b1 might as well not exist anymore.
This is how move semantics work in Rust. The bytes are copied over, and the old object is gone.
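One way to check the "destructor is not called for the old object" claim is to count drops. This is only a sketch: Resource and drops_after_move are hypothetical names, and the counter is a plain Cell borrowed by the resource.

```rust
use std::cell::Cell;

// Hypothetical resource type that counts how many times it is dropped.
struct Resource<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Resource<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Moves a resource between two bindings and reports how many drops ran.
fn drops_after_move() -> u32 {
    let counter = Cell::new(0);
    {
        let r1 = Resource { drops: &counter };
        let _r2 = r1; // the bytes are copied; `r1` is no longer usable
    } // only `_r2` is dropped here, `r1` has no destructor call
    counter.get()
}

fn main() {
    println!("drops: {}", drops_after_move());
}
```

If both bindings ran the destructor, the count would be two; with Rust's move semantics it is one.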
In some discussions about C++'s move semantics, Rust's way was called "destructive move". There have been proposals to add the "move destructor" or something similar to C++ so that it can have the same semantics. But move semantics as they are implemented in C++ don't do this. The old object is left behind, and its destructor is still called. Therefore, you need a move constructor to deal with the custom logic required by the move operation. Moving is just a specialized constructor/assignment operator that is expected to behave in a certain way.
So by default, Rust's assignment moves the object, making the old location invalid. But many types (integers, floating points, shared references) have semantics where copying the bytes is a perfectly valid way of creating a real copy, with no need to ignore the old object. Such types should implement the Copy trait, which can be derived by the compiler automatically.
#[derive(Copy)]
struct JustTwoInts {
one: i32,
two: i32,
}
This signals the compiler that assignment and argument passing do not invalidate the old object:
let j1 = JustTwoInts { one: 1, two: 2 };
let j2 = j1;
println!("Still allowed: {}", j1.one);
Note that trivial copying and the need for destruction are mutually exclusive; a type that is Copy cannot also be Drop.
Now what about when you want to make a copy of something where just copying the bytes isn't enough, e.g. a vector? There is no language feature for this; technically, the type just needs a function that returns a new object that was created the right way. But by convention this is achieved by implementing the Clone trait and its clone function. In fact, the compiler supports automatic derivation of Clone too, where it simply clones every field.
#[derive(Clone)]
struct JustTwoVecs {
one: Vec<i32>,
two: Vec<i32>,
}
let j1 = JustTwoVecs { one: vec![1], two: vec![2, 2] };
let j2 = j1.clone();
And whenever you derive Copy, you must also derive Clone: Clone is a supertrait of Copy, so deriving Copy alone does not compile, and containers like Vec use Clone internally when they are cloned themselves.
#[derive(Copy, Clone)]
struct JustTwoInts { /* as before */ }
Now, are there any downsides to this? Yes, in fact there is one rather big downside: because moving an object to another memory location is just done by copying bytes, and no custom logic, a type cannot have references into itself. In fact, Rust's lifetime system makes it impossible to construct such types safely.
But in my opinion, the trade-off is worth it.
Rust supports move semantics with features like these:
All types are moveable.
Sending a value somewhere is a move, by default, throughout the language. For non-Copy types, like Vec, the following are all moves in Rust: passing an argument by value, returning a value, assignment, pattern-matching by value.
You don't have std::move in Rust because it's the default. You're really using moves all the time.
Rust knows that moved values must not be used. If you have a value x: String and do channel.send(x), sending the value to another thread, the compiler knows that x has been moved. Trying to use it after the move is a compile-time error, "use of moved value". And you can't move a value if anyone has a reference to it (a dangling pointer).
Rust knows not to call destructors on moved values. Moving a value transfers ownership, including responsibility for cleanup. Types don't have to be able to represent a special "value was moved" state.
Moves are cheap and the performance is predictable. It's basically memcpy. Returning a huge Vec is always fast—you're just copying three words.
The Rust standard library uses and supports moves everywhere. I already mentioned channels, which use move semantics to safely transfer ownership of values across threads. Other nice touches: all types support copy-free std::mem::swap() in Rust; the Into and From standard conversion traits are by-value; Vec and other collections have .drain() and .into_iter() methods so you can smash one data structure, move all the values out of it, and use those values to build a new one.
Rust doesn't have move references, but moves are a powerful and central concept in Rust, providing a lot of the same performance benefits as in C++, and some other benefits as well.
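As a rough illustration of the standard library features listed above (std::mem::swap, .drain() and .into_iter()), here is a small sketch. The helper names shout and take_all are invented for this example.

```rust
use std::mem;

// Consumes `words` and builds a new vector of upper-cased strings.
fn shout(words: Vec<String>) -> Vec<String> {
    words.into_iter().map(|w| w.to_uppercase()).collect()
}

// Moves every element out of `src`, leaving it empty but still usable.
fn take_all(src: &mut Vec<String>) -> Vec<String> {
    src.drain(..).collect()
}

fn main() {
    let mut a = vec!["udon".to_string()];
    let mut b = vec!["ramen".to_string(), "soba".to_string()];
    mem::swap(&mut a, &mut b); // swaps the two vectors in place, no cloning
    let taken = take_all(&mut a); // the strings are moved out, not cloned
    println!("{:?} {:?} {:?}", a, taken, shout(b));
}
```

None of these calls clones a string's heap buffer: swap exchanges the vector headers, drain and into_iter move the String headers, and only to_uppercase allocates new text.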
let s = vec!["udon".to_string(), "ramen".to_string(), "soba".to_string()];
this is how it is represented in memory
Then let's assign s to t
let t = s;
this is what happens:
let t = s moved the vector’s three header fields from s to t; now t is the owner of the vector. The vector’s elements stayed just where they were, and nothing happened to the strings either. Every value still has a single owner.
Now s is uninitialized. If I write this
let u = s;
I get the error "use of moved value: s".
Rust applies move semantics to almost any use of a value (except for Copy types). Passing arguments to functions moves ownership to the function’s parameters; returning a value from a function moves ownership to the caller. Building a tuple moves the values into the tuple. And so on.
Reference: Programming Rust by Jim Blandy, Jason Orendorff, Leonora F. S. Tindall
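The three cases named above (passing an argument, returning a value, building a tuple) can be sketched as follows. The function names push_soba and pair_up are made up for the example.

```rust
// Takes ownership of `v`, then hands it back to the caller by returning it.
fn push_soba(mut v: Vec<String>) -> Vec<String> {
    v.push("soba".to_string());
    v
}

// Building the tuple moves both values into it.
fn pair_up(name: String, items: Vec<String>) -> (String, Vec<String>) {
    (name, items)
}

fn main() {
    let s = vec!["udon".to_string(), "ramen".to_string()];
    let s = push_soba(s); // moved in, then moved back out
    let menu = pair_up("noodles".to_string(), s);
    // `s` was moved into the tuple; only `menu` can reach the vector now.
    println!("{} has {} items", menu.0, menu.1.len());
}
```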
Primitive types cannot be empty and have a fixed size, while non-primitive types can grow and can be empty. Because primitives cannot be empty and are fixed size, allocating memory for them and handling them is relatively easy. Handling non-primitives, however, involves computing how much memory they will take as they grow, plus other costly operations. With primitives Rust makes a copy; with non-primitives Rust does a move.
fn main(){
// this variable is stored in stack. primitive types are fixed size, we can store them on stack
let x:i32=10;
// s1 is stored in heap. os will assign memory for this. pointer of this memory will be stored inside stack.
// s1 is the owner of memory space in heap which stores "my name"
// if we dont clear this memory, os will have no access to this memory. rust uses ownership to free the memory
let s1=String::from("my name");
// s1 will be cleared from the stack, s2 will be added to the stack pointing to the same heap memory location
// making new copy of this string will create extra overhead, so we MOVED the ownership of s1 into s2
let s2=s1;
// s3 is the pointer to s2 which points to heap memory. we Borrowed the ownership
// Borrowing is similar to borrowing in real life, you borrow a car from your friend, but its ownership does not change
let s3=&s2;
// this is creating new "my name" in heap and s4 stored as the pointer of this memory location on the heap
let s4=s2.clone();
}
Same principle applies when we pass primitive or non-primitive type arguments to a function:
fn main(){
// since this is primitive stack_function will make copy of it so this will remain unchanged
let stack_num=50;
let mut heap_vec=vec![2,3,4];
// when we pass a stack variable to a function, function will make a copy of that and will use the copy. "move" does not occur here
stack_var_fn(stack_num);
println!("The stack_num inside the main fn did not change:{}",stack_num);
// if we passed heap_vec by value, ownership would move into the function, and the vector would be dropped when the function returns
// instead we can pass a reference to reach the value on the heap, so the function only borrows heap_vec
// we use the "&" operator to indicate that we are passing a reference
heap_var_fn(&heap_vec);
println!("the heap_vec inside main is:{:?}",heap_vec);
}
// this fn that we pass an argument stored in stack
fn stack_var_fn(mut var:i32){
// we are changing the arguments value
var=56;
println!("Var inside stack_var_fn is :{}",var);
}
// this fn that we pass an arg that stored in heap
fn heap_var_fn(var:&Vec<i32>){
println!("Var:{:?}",var);
}
I would like to add that it is not necessary for move to memcpy. If the object on the stack is large enough, Rust's compiler may choose to pass the object's pointer instead.
In C++ the default assignment of classes and structs is shallow copy. The values are copied, but not the data referenced by pointers. So modifying one instance changes the referenced data of all copies. The values (e.g. those used for administration) remain unchanged in the other instance, likely rendering an inconsistent state. A move semantic avoids this situation. Example for a C++ implementation of a memory managed container with move semantic:
template <typename T>
class object
{
T *p;
public:
object()
{
p=new T;
}
~object()
{
if (p != (T *)0) delete p;
}
template <typename V> //type V is used to allow for conversions between reference and value
object(object<V> &v) //copy constructor with move semantic
{
p = v.p; //move ownership
v.p = (T *)0; //make sure it does not get deleted
}
object &operator=(object<T> &v) //move assignment
{
delete p;
p = v.p;
v.p = (T *)0;
return *this;
}
T &operator*() { return *p; } //reference to object *d
T *operator->() { return p; } //pointer to object data d->
};
Such an object is automatically garbage collected and can be returned from functions to the calling program. It is extremely efficient and does the same as Rust does:
object<somestruct> somefn() //function returning an object
{
object<somestruct> a;
auto b=a; //move semantic; a becomes invalid
return b; //this moves the object to the caller
}
auto c=somefn();
//now c owns the data; memory is freed after leaving the scope

Is it available to drop a variable holding a primitive value in Rust?

Updated Question:
Or I can ask this way: for every type T, if it's Copy, then there is no way for it to be moved, right? I mean is there any way like the std::move in C++ can move a copyable value explicitly?
Original Question:
Suppose we have the piece of Rust code below, in which I define a variable x holding an i32 value. What I want to do is drop its value and invalidate it. I tried to use ptr::drop_in_place to drop it through a pointer, but it doesn't work. Why?
fn main() {
let mut x = 10;
use std::ptr;
unsafe {
ptr::drop_in_place(&mut x as *mut i32);
}
println!("{}", x); // x is still accessible here.
}
For every type T, if it's Copy, then there is no way for it to be moved, right?
That is one way to word it. The semantics of Copy are such that any move leaves the original object valid.
Because of this, and that Drop and Copy are mutually exclusive traits, there's no way to "drop" a Copy. The traditional method of calling std::mem::drop(x) won't work. The only meaningful thing you can do is let the variable fall out of scope:
fn main() {
{
let x = 10;
}
println!("{}", x); // x is no longer accessible here.
}
I mean is there any way like the std::move in C++ can move a copyable value explicitly?
The specifics of copying vs moving are quite different between C++ and Rust. All types are moveable in Rust, whereas it's opt-in for C++. Moving and copying in Rust are always bitwise copies; there's no room for custom code. Moving in Rust leaves the source object invalid, whereas it's still usable as a value in C++.
I can go on, but I'll leave off one last bit: moving a primitive in C++ isn't different than a copy either.
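To see why std::mem::drop cannot invalidate a Copy value, consider this small sketch. The function name drop_then_read is hypothetical; the allow attribute is only there because recent compilers, as far as I know, warn about this pattern through the dropping_copy_types lint.

```rust
// Calls `drop` on an integer and then reads the original again.
#[allow(dropping_copy_types)]
fn drop_then_read(x: i32) -> i32 {
    drop(x); // `drop` receives a copy; the original `x` is untouched
    x
}

fn main() {
    let x = 10;
    println!("{}", drop_then_read(x));
}
```

Since drop takes its argument by value, passing a Copy type hands it a copy, and the original binding stays valid.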

What happens to the stack when a value is moved in Rust? [duplicate]

In Rust, there are two possibilities to take a reference
Borrow, i.e., take a reference but don't allow mutating the reference destination. The & operator borrows ownership from a value.
Borrow mutably, i.e., take a reference to mutate the destination. The &mut operator mutably borrows ownership from a value.
The Rust documentation about borrowing rules says:
First, any borrow must last for a scope no greater than that of the
owner. Second, you may have one or the other of these two kinds of
borrows, but not both at the same time:
one or more references (&T) to a resource,
exactly one mutable reference (&mut T).
I believe that taking a reference is creating a pointer to the value and accessing the value by the pointer. This could be optimized away by the compiler if there is a simpler equivalent implementation.
However, I don't understand what move means and how it is implemented.
For types implementing the Copy trait it means copying e.g. by assigning the struct member-wise from the source, or a memcpy(). For small structs or for primitives this copy is efficient.
And for move?
This question is not a duplicate of What are move semantics? because Rust and C++ are different languages and move semantics are different between the two.
Semantics
Rust implements what is known as an Affine Type System:
Affine types are a version of linear types imposing weaker constraints, corresponding to affine logic. An affine resource can only be used once, while a linear one must be used once.
Types that are not Copy, and are thus moved, are Affine Types: you may use them either once or never, nothing else.
Rust qualifies this as a transfer of ownership in its Ownership-centric view of the world (*).
(*) Some of the people working on Rust are much more qualified than I am in CS, and they knowingly implemented an Affine Type System; however contrary to Haskell which exposes the math-y/cs-y concepts, Rust tends to expose more pragmatic concepts.
Note: it could be argued that Affine Types returned from a function tagged with #[must_use] are actually Linear Types from my reading.
Implementation
It depends. Please keep in mind that Rust is a language built for speed, and there are numerous optimization passes at play here which will depend on the compiler used (rustc + LLVM, in our case).
Within a function body (playground):
fn main() {
let s = "Hello, World!".to_string();
let t = s;
println!("{}", t);
}
If you check the LLVM IR (in Debug), you'll see:
%_5 = alloca %"alloc::string::String", align 8
%t = alloca %"alloc::string::String", align 8
%s = alloca %"alloc::string::String", align 8
%0 = bitcast %"alloc::string::String"* %s to i8*
%1 = bitcast %"alloc::string::String"* %_5 to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* %1, i8* %0, i64 24, i32 8, i1 false)
%2 = bitcast %"alloc::string::String"* %_5 to i8*
%3 = bitcast %"alloc::string::String"* %t to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* %3, i8* %2, i64 24, i32 8, i1 false)
Underneath the covers, rustc invokes a memcpy from the result of "Hello, World!".to_string() to s and then to t. While it might seem inefficient, checking the same IR in Release mode you will realize that LLVM has completely elided the copies (realizing that s was unused).
The same situation occurs when calling a function: in theory you "move" the object into the function stack frame, however in practice if the object is large the rustc compiler might switch to passing a pointer instead.
Another situation is returning from a function, but even then the compiler might apply "return value optimization" and build directly in the caller's stack frame -- that is, the caller passes a pointer into which to write the return value, which is used without intermediary storage.
The ownership/borrowing constraints of Rust enable optimizations that are difficult to reach in C++ (which also has RVO but cannot apply it in as many cases).
So, the digest version:
moving large objects is inefficient, but there are a number of optimizations at play that might elide the move altogether
moving involves a memcpy of std::mem::size_of::<T>() bytes, so moving a large String is efficient because it only copies a couple bytes whatever the size of the allocated buffer they hold onto
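The second point can be checked with a short sketch: the number of bytes a move has to copy is the size of the String header, not of the text it owns. The helper name header_size is made up for this example.

```rust
use std::mem::{size_of, size_of_val};

// Size in bytes of the String value itself (the header), not its heap buffer.
fn header_size(s: &String) -> usize {
    size_of_val(s)
}

fn main() {
    let short = String::from("hi");
    let long = "x".repeat(1_000_000);
    // Both lines print the same number, whatever the text length is.
    println!("short: {} bytes", header_size(&short));
    println!("long:  {} bytes", header_size(&long));
    println!("size_of::<String>() = {}", size_of::<String>());
}
```

On a typical 64-bit target this matches the 24 in the memcpy calls of the IR above, although the exact number depends on the platform.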
When you move an item, you are transferring ownership of that item. That's a key component of Rust.
Let's say I had a struct, and then I assign the struct from one variable to another. By default, this will be a move, and I've transferred ownership. The compiler will track this change of ownership and prevent me from using the old variable any more:
pub struct Foo {
value: u8,
}
fn main() {
let foo = Foo { value: 42 };
let bar = foo;
println!("{}", foo.value); // error: use of moved value: `foo.value`
println!("{}", bar.value);
}
how it is implemented.
Conceptually, moving something doesn't need to do anything. In the example above, there wouldn't be a reason to actually allocate space somewhere and then move the allocated data when I assign to a different variable. I don't actually know what the compiler does, and it probably changes based on the level of optimization.
For practical purposes though, you can think that when you move something, the bits representing that item are duplicated as if via memcpy. This helps explain what happens when you pass a variable to a function that consumes it, or when you return a value from a function (again, the optimizer can do other things to make it efficient, this is just conceptually):
// Ownership is transferred from the caller to the callee
fn do_something_with_foo(foo: Foo) {}
// Ownership is transferred from the callee to the caller
fn make_a_foo() -> Foo { Foo { value: 42 } }
"But wait!", you say, "memcpy only comes into play with types implementing Copy!". This is mostly true, but the big difference is that when a type implements Copy, both the source and the destination are valid to use after the copy!
One way of thinking of move semantics is the same as copy semantics, but with the added restriction that the thing being moved from is no longer a valid item to use.
However, it's often easier to think of it the other way: The most basic thing that you can do is to move / give ownership away, and the ability to copy something is an additional privilege. That's the way that Rust models it.
This is a tough question for me! After using Rust for a while the move semantics are natural. Let me know what parts I've left out or explained poorly.
Rust's move keyword always bothered me, so I decided to write down my understanding, which I reached after discussion with my colleagues.
I hope this might help someone.
let x = 1;
In the above statement, x is a variable whose value is 1. Now,
let y = || println!("y is a variable whose value is a closure");
So, the move keyword is used to transfer the ownership of a variable to the closure.
In the example below, without move, x is not owned by the closure. Hence x is not owned by y and remains available for further use.
let x = 1;
let y = || println!("this is a closure that prints x = {}", x);
In the next case, on the other hand, x is owned by the closure, that is, by y. For a non-Copy type such as String, x would no longer be available afterwards; because an integer is Copy, the closure here receives its own copy and x itself can still be used.
let x = 1;
let y = move || println!("this is a closure that prints x = {}", x);
By owning I mean containing as a member variable. The example cases above are in the same situation as the following two cases. We can also assume the below explanation as to how the Rust compiler expands the above cases.
The former (without move; i.e. no transfer of ownership),
struct ClosureObject<'a> {
x: &'a u32
}
let x = 1;
let y = ClosureObject {
x: &x
};
The latter (with move; i.e. transfer of ownership),
struct ClosureObject {
x: u32
}
let x = 1;
let y = ClosureObject {
x: x
};
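The difference between the two capture modes is easier to observe with a non-Copy type. The following is a sketch under that assumption; make_greeter is a hypothetical helper, and the commented-out line marks where the compiler would reject a use after the move.

```rust
// Returns a closure that owns `name`; `move` transfers the String into it.
fn make_greeter(name: String) -> impl Fn() -> String {
    move || format!("hello, {}", name)
}

fn main() {
    let name = String::from("ferris");
    let by_ref = || name.len(); // borrows `name`
    println!("{}", by_ref());
    println!("{}", name); // still usable: the closure only borrowed it

    let greet = make_greeter(name); // `name` is moved into the closure here
    // println!("{}", name); // would not compile: value used after move
    println!("{}", greet());

    let n = 1;
    let show = move || n + 1; // i32 is Copy, so the closure gets a copy
    println!("{} {}", show(), n); // `n` is still usable
}
```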
Please let me answer my own question. I had trouble, but by asking a question here I did Rubber Duck Problem Solving. Now I understand:
A move is a transfer of ownership of the value.
For example the assignment let x = a; transfers ownership: At first a owned the value. After the let it's x who owns the value. Rust forbids using a thereafter.
In fact, if you do println!("a: {:?}", a); after the let, the Rust compiler says:
error: use of moved value: `a`
println!("a: {:?}", a);
^
Complete example:
#[derive(Debug)]
struct Example { member: i32 }
fn main() {
let a = Example { member: 42 }; // A struct is moved
let x = a;
println!("a: {:?}", a);
println!("x: {:?}", x);
}
And what does this move mean?
It seems that the concept comes from C++11. A document about C++ move semantics says:
From a client code point of view, choosing move instead of copy means that you don't care what happens to the state of the source.
Aha. C++11 does not care what happens with the source. So in this vein, Rust is free to forbid using the source after a move.
And how it is implemented?
I don't know. But I can imagine that Rust does literally nothing. x is just a different name for the same value. Names usually are compiled away (except of course debugging symbols). So it's the same machine code whether the binding has the name a or x.
It seems C++ does the same in copy constructor elision.
Doing nothing is the most efficient possible.
Passing a value to function, also results in transfer of ownership; it is very similar to other examples:
struct Example { member: i32 }
fn take(ex: Example) {
// 2) Now ex is pointing to the data a was pointing to in main
println!("a.member: {}", ex.member)
// 3) When ex goes out of scope, so does the access to the data it
// was pointing to. So Rust frees that memory.
}
fn main() {
let a = Example { member: 42 };
take(a); // 1) The ownership is transferred to the function take
// 4) We can no longer use a to access the data it pointed to
println!("a.member: {}", a.member);
}
Hence the expected error:
post_test_7.rs:12:30: 12:38 error: use of moved value: `a.member`
let s1:String= String::from("hello");
let s2:String= s1;
To ensure memory safety, Rust invalidates s1, so instead of being a shallow copy, this is called a move.
fn main() {
// Each value in Rust has a variable that is called its owner.
// There can only be one owner at a time.
let s = String::from("hello");
take_ownership(s);
println!("{}", s);
// Error: borrow of moved value: `s`. The value is borrowed here after the move, so s cannot be used after it has been moved.
// Passing a parameter into a function is the same as assigning s to another variable. Passing s moves it into the my_string parameter, then println!("{}", my_string) runs and prints the string. When that scope ends, my_string gets dropped.
let x: i32 = 2;
makes_copy(x);
// Instead of being moved, integers are copied, so we can still use x after the call.
// Primitive types are Copy, and they are stored on the stack because their size is known at compile time.
println!("{}", x);
}
fn take_ownership(my_string: String) {
println!("{}", my_string);
}
fn makes_copy(some_integer: i32) {
println!("{}", some_integer);
}

How to force a move of a type which implements the Copy trait?

A custom type by default is moved through default assignment. By implementing the Copy trait, I get "shallow copy semantics" through default assignment. I may also get "deep copy semantics" by implementing the Clone trait.
Is there a way to force a move on a Copy type?
I tried using the move keyword and a closure (let new_id = move || id;) but I get an error message. I'm not into closures yet, but, from seeing them here and there, I thought that that would have worked.
I don't really understand your question, but you certainly seem confused. So I'll address what seems to be the root of this confusion:
The C++ notions of copy/move I think I get correctly, but this 'everything is a memcpy anyway' is, well, it hasn't been very intuitive any time I read it
When thinking about Rust's move semantics, ignore C++. The C++ story is way more complicated than Rust's, which is remarkably simple. However, explaining Rust's semantics in terms of C++ is a mess.
TL;DR: Copies are moves. Moves are copies. Only the type checker knows the difference. So when you want to "force a move" for a Copy type, you are asking for something you already have.
So we have three semantics:
let a = b where b is not Copy
let a = b where b is Copy
let a = b.clone() where b is Clone
Note: There is no meaningful difference between assignment and initialization (like in C++) - assignment just first drops the old value.
Note: Function call arguments work just like assignment. f(b) assigns b to the argument of f.
First things first.
The a = b always performs a memcpy.
This is true in all three cases.
When you do let a = b, b is memcpy'd into a.
When you do let a = b.clone(), the result of b.clone() is memcpy'd into a.
Moves
Imagine b was a Vec. A Vec looks like this:
{ &mut data, length, capacity }
When you write let a = b you thus end up with:
b = { &mut data, length, capacity }
a = { &mut data, length, capacity }
This means that a and b both reference &mut data, which means we have aliased mutable data.
The type-system doesn't like this so says we can't use b again. Any access to b will fail at compile-time.
Note: a and b don't have to alias heap data to make using both a bad idea. For example, they could both be file handles - a copy would result in the file being closed twice.
Note: Moves do have extra semantics when destructors are involved, but the compiler won't let you write Copy on types with destructors anyway.
Copies
Imagine b was an Option<i32>. An Option<i32> looks like this:
{ is_valid, data }
When you write let a = b you thus end up with:
b = { is_valid, data }
a = { is_valid, data }
These are both usable simultaneously. To tell the type system that this is the case, one marks Option<i32> as Copy.
Note: Marking something copy doesn't change what the code does. It only allows more code. If you remove a Copy implementation, your code will either error or do exactly the same thing. In the same vein, marking a non-Copy type as Copy will not change any compiled code.
Clones
Imagine you want to copy a Vec, then. You implement Clone, which produces a new Vec, and do
let a = b.clone()
This performs two steps. We start with:
b = { &mut data, length, capacity }
Running b.clone() gives us an additional rvalue temporary
b = { &mut data, length, capacity }
{ &mut copy, length, capacity } // temporary
Running let a = b.clone() memcpys this into a:
b = { &mut data, length, capacity }
{ &mut copy, length, capacity } // temporary
a = { &mut copy, length, capacity }
Further access of the temporary is thus prevented by the type system, since Vec is not Copy.
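The header diagrams above can be checked by comparing the address of the heap buffer. This is a sketch with made-up function names: a move should leave the buffer address unchanged, while clone should produce a different one.

```rust
// Returns (buffer address before the move, buffer address after the move).
fn move_keeps_buffer() -> (*const i32, *const i32) {
    let b = vec![1, 2, 3];
    let before = b.as_ptr();
    let a = b; // move: only the three header words are copied
    (before, a.as_ptr())
}

// Returns (original buffer address, clone's buffer address).
fn clone_makes_new_buffer() -> (*const i32, *const i32) {
    let b = vec![1, 2, 3];
    let a = b.clone(); // deep copy: a new heap allocation
    (b.as_ptr(), a.as_ptr())
}

fn main() {
    let (p, q) = move_keeps_buffer();
    println!("same buffer after move:  {}", p == q);
    let (p, q) = clone_makes_new_buffer();
    println!("same buffer after clone: {}", p == q);
}
```

The returned pointers are only compared, never dereferenced, so no unsafe code is needed.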
But what about efficiency?
One thing I skipped over so far is that moves and copies can be elided. In practice the optimizer removes many trivial moves and copies, although this is an optimization rather than a language-level guarantee.
Because the compiler (after lifetime checking) sees the same result in both cases, these are elided in exactly the same way.
Wrap the copyable type in another type that doesn't implement Copy.
struct Noncopyable<T>(T);
fn main() {
let v0 = Noncopyable(1);
let v1 = v0;
println!("{}", v0.0); // error: use of moved value: `v0.0`
}
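For contrast, here is a sketch of the same wrapper used in a way that does compile. The helper unwrap_inner is hypothetical and only there to show a by-value parameter consuming the wrapper.

```rust
// Same wrapper as above: it does not derive Copy, so it is move-only
// even though the wrapped i32 is itself Copy.
struct Noncopyable<T>(T);

// Hypothetical helper: takes the wrapper by value and returns the inner value.
fn unwrap_inner(v: Noncopyable<i32>) -> i32 {
    v.0
}

fn main() {
    let v0 = Noncopyable(1);
    let v1 = v0; // v0 is moved from here and can no longer be used
    let peek = v1.0; // reading the Copy field copies the i32; v1 stays valid
    let inner = unwrap_inner(v1); // v1 is moved into the function
    println!("{} {}", peek, inner);
}
```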
New Answer
Sometimes I just want it to scream at me "put a new value in here!".
Then the answer is "no". When moving a type that implements Copy, both the source and destination will always be valid. When moving a type that does not implement Copy, the source will never be valid and the destination will always be valid. There is no syntax or trait that means "let me pick if this type that implements Copy acts as Copy at this time".
Original Answer
I just want to sometimes say "yeah, this type is Copy, but I really don't need this value in this variable anymore. This function takes an arg by val, just take it."
It sounds like you are trying to do the job of the optimizer by hand. Don't worry about that; the optimizer will do it for you, so you don't need to think about it.
Moves and copies are basically the same runtime operation under the covers. The compiler inserts code to make a bitwise copy from the first variable's address into the second variable's address. In the case of a move, the compiler also invalidates the first variable, so that if it is subsequently used it will be a compile error.
Even so, I think there would still be value in the Rust language allowing a program to say that an assignment is an explicit move instead of a copy. It could catch bugs by preventing inadvertent references to the wrong instance. It might also generate more efficient code in some instances, if the compiler knows you don't need two copies and could jiggle the bindings around to avoid the bitwise copy.
e.g. if you could write an "= move" assignment or similar (hypothetical syntax, not valid Rust today):
let coord = (99.9, 73.45);
let mut coord2 = move coord;
coord2.0 += 100.0;
println!("coord2 = {:?}", coord2);
println!("coord = {:?}", coord); // Error
At runtime, copies and moves, in Rust, have the same effect. However, at compile-time, in the case of a move, the variable which an object is moved from is marked as unusable, but not in the case of a copy.
When you're using Copy types, you always want value semantics, and object semantics when not using Copy types.
Objects in Rust don't have a stable address: the address often changes between moves, because a move is a bitwise copy to a new location and each object is owned by exactly one binding. This is very different from other languages!
In Rust, when you use (or move, in Rust's terms) a value that is Copy, the original value is still valid. If you want to simulate the behaviour of non-copyable values, so that the variable is invalidated after a specific use, you can do:
let v = 42i32;
// ...
let m = v;
// redefine v such that v is no longer a valid (initialized) variable afterwards
// Unfortunately you have to write a type here. () is the easiest,
// but can be used unintentionally.
let v: ();
// If the ! type were stabilized, you could write
let v: !;
// otherwise, you can define your own:
enum NeverType {};
let v: NeverType;
// ...
If you later change v to something that is not Copy, you don't have to change the code above to avoid using the moved value.
Correction on some misunderstanding on the question
The difference between Clone and Copy is NOT "shallow copy" and "deep copy" semantics. Copy is "memcpy" semantics and Clone is whatever the implementors like, that is the only difference. Although, by definition, things which require a "deep copy" are not able to implement Copy.
When a type implements both Copy and Clone, it is expected that both have the same semantics except that Clone can have side effects. For a type that implements Copy, its Clone should not have "deep copy" semantics and the cloned result is expected to be the same as a copied result.
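A short sketch of that expectation, assuming a made-up Point type that derives both traits: the derived Clone does the same thing as the implicit copy, so the two results compare equal.

```rust
// Hypothetical type for illustration: plain data, so it can be Copy.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// Produces one value via an implicit copy and one via an explicit clone().
fn copy_and_clone(p: Point) -> (Point, Point) {
    let copied = p; // memcpy semantics; `p` stays usable
    let cloned = p.clone(); // derived Clone; for a Copy type this matches the copy
    (copied, cloned)
}

fn main() {
    let (copied, cloned) = copy_and_clone(Point { x: 1, y: 2 });
    println!("{:?} {:?}", copied, cloned);
}
```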
As for the closure attempt: if you want the closure to help, you probably meant to run it, like let new_id = (move || id)();. But if id is Copy, then id is still valid after the move, so this does not help at all.

Why is the Copy trait needed for default (struct valued) array initialization?

When I define a struct like this, I can pass it to a function by value without adding anything specific:
#[derive(Debug)]
struct MyType {
member: u16,
}
fn my_function(param: MyType) {
println!("param.member: {}", param.member);
}
When I want to create an array of MyType instances with a default value
fn main() {
let array = [MyType { member: 1234 }; 100];
println!("array[42].member: {}", array[42].member);
}
The Rust compiler tells me:
error[E0277]: the trait bound `MyType: std::marker::Copy` is not satisfied
--> src/main.rs:11:17
|
11 | let array = [MyType { member: 1234 }; 100];
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the trait `std::marker::Copy` is not implemented for `MyType`
|
= note: the `Copy` trait is required because the repeated element will be copied
When I implement Copy and Clone, everything works:
impl Copy for MyType {}
impl Clone for MyType {
fn clone(&self) -> Self {
MyType {
member: self.member.clone(),
}
}
}
Why do I need to specify an empty Copy trait implementation?
Is there a simpler way to do this or do I have to re-think something?
Why does it work when passing an instance of MyType to the function by value? My guess is that it is being moved, so there is no copy in the first place.
Contrary to C/C++, Rust has a very explicit distinction between types which are copied and types which are moved. Note that this is only a semantic distinction; on the implementation level a move is a shallow bytewise copy. However, the compiler places certain restrictions on what you can do with variables you have moved from.
By default every type is only moveable (non-copyable). It means that values of such types are moved around:
let x = SomeNonCopyableType::new();
let y = x;
x.do_something(); // error!
do_something_else(x); // error!
You see, the value which was stored in x has been moved to y, and so you can't do anything with x.
Move semantics is a very important part of the ownership concept in Rust. You can read more on it in the official guide.
Some types, however, are simple enough that their bytewise copy is also their semantic copy: if you copy a value byte-by-byte, you get a new, completely independent value. Primitive numbers are such types, for example. This property is designated by the Copy trait in Rust, i.e. if a type implements Copy, then values of this type are implicitly copyable. Copy does not contain methods; it exists solely to mark that implementing types have a certain property, and so it is usually called a marker trait (as are a few other traits which do similar things).
However, it does not work for all types. For example, structures like dynamically allocated vectors cannot be automatically copyable: if they were, the address of the allocation contained in them would be byte-copied too, and then the destructor of such a vector would run twice over the same allocation, causing this pointer to be freed twice, which is a memory error.
So by default custom types in Rust are not copyable. But you can opt-in for it using #[derive(Copy, Clone)] (or, as you noticed, using direct impl; they are equivalent, but derive usually reads better):
#[derive(Copy, Clone)]
struct MyType {
member: u16
}
(deriving Clone is necessary because Clone is a supertrait of Copy, so everything which is Copy must also be Clone)
If your type can be automatically copyable in principle, that is, it doesn't have an associated destructor and all of its members are Copy, then with derive your type will also be Copy.
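One way to see the marker in action is a generic function with a T: Copy bound. This is only a sketch; duplicate is a hypothetical helper, and it compiles for MyType only because of the derive.

```rust
#[derive(Copy, Clone)]
struct MyType {
    member: u16,
}

// Hypothetical helper: using `value` twice is only allowed because T is Copy.
fn duplicate<T: Copy>(value: T) -> (T, T) {
    (value, value)
}

fn main() {
    let (a, b) = duplicate(MyType { member: 7 });
    println!("{} {}", a.member, b.member);
}
```

Removing Copy from the derive list makes the call to duplicate fail with an unsatisfied trait bound, which is the same kind of error the array initializer produces.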
You can use Copy types in array initializer precisely because the array will be initialized with bytewise copies of the value used in this initializer, so your type has to implement Copy to designate that it indeed can be automatically copied.
The above was the answer to 1 and 2. As for 3, yes, you are absolutely correct. It does work precisely because the value is moved into the function. If you tried to use a variable of MyType type after you passed it into the function, you would quickly notice an error about using a moved value.
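The array-initializer rule can be sketched as follows. The NotCopy type and both function names are made up for illustration. The second function uses std::array::from_fn, which is my addition rather than something this answer proposes: it builds each element with a closure, so no Copy bound is needed.

```rust
#[derive(Debug, Copy, Clone)]
struct MyType {
    member: u16,
}

// Hypothetical non-Copy counterpart, for comparison.
#[derive(Debug)]
struct NotCopy {
    member: u16,
}

// The repeat expression [value; N] needs MyType: Copy,
// because the single value is duplicated bytewise.
fn copied_array() -> [MyType; 100] {
    [MyType { member: 1234 }; 100]
}

// For a non-Copy type, each element can be constructed separately instead.
fn built_array() -> [NotCopy; 100] {
    std::array::from_fn(|_| NotCopy { member: 1234 })
}

fn main() {
    let a = copied_array();
    let b = built_array();
    println!("{} {}", a[42].member, b[42].member);
}
```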
Why do I need to specify an empty Copy trait implementation?
Copy is a special built-in trait such that T implementing Copy represents that it is safe to duplicate a value of type T with a shallow byte copy.
This simple definition means that one just needs to tell the compiler those semantics are correct, since there's no fundamental change in run-time behaviour: both a move (a non-Copy type) and a "copy" are shallow byte copies; it's just a question of whether the source is usable later. See an older answer for more details.
(The compiler will complain if the contents of MyType aren't Copy themselves; previously it would be automatically implemented, but that all changed with opt-in built-in traits.)
Creating an array is duplicating the value via shallow copies, and this is guaranteed to be safe if T is Copy. It is safe in more general situations, #5244 covers some of them, but at the core, a non-Copy struct won't be able to be used to create a fixed-length array automatically because the compiler can't tell that the duplication is safe/correct.
Is there a simpler way to do this or do I have to re-think something (I'm coming from C)?
#[derive(Copy, Clone)]
struct MyType {
member: u16
}
will insert the appropriate implementations: the empty one for Copy, plus the Clone implementation that Copy requires. (#[derive] works with several other traits, e.g. one often sees #[derive(Copy, Clone, PartialEq, Eq)].)
Why does it work when passing an instance of MyType to the function by value? My guess is that it is being moved, so there is no copy in the first place.
Well, without calling the function one doesn't see the move vs. copy behaviour (if you were to call it twice with the same non-Copy value, the compiler would emit an error about moved values). But a "move" and a "copy" are essentially the same on the machine. All by-value uses of a value are semantically shallow copies in Rust, just like in C.
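A sketch of the "call it twice" experiment, with two made-up types that differ only in whether they derive Copy:

```rust
#[derive(Copy, Clone)]
struct CopyType {
    member: u16,
}

struct MoveType {
    member: u16,
}

// Both functions take their argument by value.
fn read_copy(param: CopyType) -> u16 {
    param.member
}

fn read_move(param: MoveType) -> u16 {
    param.member
}

fn main() {
    let c = CopyType { member: 1 };
    let first = read_copy(c);
    let second = read_copy(c); // fine: `c` was copied, not moved

    let m = MoveType { member: 2 };
    let only = read_move(m);
    // read_move(m); // would not compile: `m` was moved by the first call

    println!("{} {} {}", first, second, only);
}
```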

Resources