I was playing around with the Vec struct and ran into some interesting errors and behavior that I can't quite understand. Consider the following code.
fn main() {
let v = vec![box 1i];
let f = v[0];
}
When evaluated in the rust playpen, the code produces the following errors:
<anon>:3:13: 3:17 error: cannot move out of dereference (dereference is implicit, due to indexing)
<anon>:3 let f = v[0];
^~~~
<anon>:3:9: 3:10 note: attempting to move value to here (to prevent the move, use `ref f` or `ref mut f` to capture value by reference)
<anon>:3 let f = v[0];
^
error: aborting due to previous error
playpen: application terminated with error code 101
Program ended.
My understanding of Vec's index method is that it returns references to the values in a Vec, so I don't understand what moves or implicit dereferences are happening.
Also, when I change the f variable to an underscore, as below, no errors are produced!
fn main() {
let v = vec![box 1i];
let _ = v[0];
}
I was hoping someone could explain the errors I was getting and why they go away when switching f to _.
No idea which syntax sugar v[0] implements, but it is trying to move the value instead of getting a reference.
But if you call .index(), it works and gives you a reference with the same lifetime of the vector:
fn main() {
let v = vec![box 1i];
let f = v.index(&0);
println!("{}", f);
}
The second example works because as the value is being discarded, it doesn't try to move it.
EDIT:
The desugar for v[0] is *v.index(&0) (from: https://github.com/rust-lang/rust/blob/fb72c4767fa423649feeb197b50385c1fa0a6fd5/src/librustc/middle/trans/expr.rs#L467 ).
fn main() {
let a = vec!(1i);
let b = a[0] == *a.index(&0);
println!("{}" , b);
}
true
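For readers on a current toolchain, the same check can be sketched as follows. This assumes a recent stable Rust, where Index::index takes the index by value (so the call is a.index(0) rather than a.index(&0)) and the Index trait has to be imported explicitly.

```rust
use std::ops::Index;

fn main() {
    let a = vec![1];
    // `a[0]` is sugar for `*a.index(0)`, so both sides name the same element.
    let b = a[0] == *a.index(0);
    println!("{}", b);
}
```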
In your code, let f = v[0]; assigns f by value (as the error message says, the indexing implicitly dereferences): the compiler tries to copy or move v[0] into f. Since v[0] is a box, it cannot be copied, so it would have to be moved, as in this situation:
let a = box 1i;
let b = a;
// a cannot be used anymore, it has been moved
// following line would not compile
println!("{}", a);
But values cannot be moved out of the vector via indexing, as it is a reference that is returned.
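If the goal is to get at the element anyway, there are a few usual options. The following is only a sketch in current syntax (Box::new(1) instead of box 1i, which is an assumption about the toolchain, not the code from the question): borrow the element, clone it, or remove it from the vector to take ownership.

```rust
fn main() {
    let mut v = vec![Box::new(1), Box::new(2), Box::new(3)];

    // Borrow: the vector keeps ownership of the box.
    let first: &Box<i32> = &v[0];
    assert_eq!(**first, 1);

    // Clone: make an independent copy of the box.
    let copy: Box<i32> = v[0].clone();
    assert_eq!(*copy, 1);

    // Take ownership by removing the element from the vector.
    let owned: Box<i32> = v.remove(0);
    assert_eq!(*owned, 1);
    assert_eq!(v.len(), 2);
}
```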
Concerning _, this code :
fn main() {
let v = vec![box 1i];
let _ = v[0];
println!("{}", _);
}
produces this error :
<anon>:4:20: 4:21 error: unexpected token: `_`
<anon>:4 println!("{}", _);
^
_ is not a variable name but a special pattern in Rust that says you don't care about the value. It binds nothing, so the compiler doesn't try to copy or move anything.
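A small sketch of the difference between _ and a named binding, written in current syntax (an assumption; the String values are just for illustration). The value on the right of let _ = ... is still usable afterwards, while a named binding such as _t really does take ownership.

```rust
fn main() {
    let v = vec![Box::new(1)];
    let _ = v[0]; // `_` binds nothing, so nothing is moved out of the vector
    assert_eq!(*v[0], 1);

    let s = String::from("hello");
    let _ = s; // no move: `s` is still usable
    assert_eq!(s, "hello");

    let t = String::from("world");
    let _t = t; // `_t` is an ordinary variable, so `t` is moved here
    // println!("{}", t); // would not compile: `t` has been moved
    assert_eq!(_t, "world");
}
```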
You can get your original function to work by taking a reference to v[0]:
fn main() {
let v = vec![box 1i];
let f = &v[0]; // notice the &
println!("{}",f);
}
I don't know why the underscore silences your error. It should probably raise an error since the underscore alone is an invalid variable name (I think). Attempting to print it yields an error:
fn main() {
let v = vec![box 1i];
let _ = &v[0];
println!("{}",_);
}
Output:
<anon>:4:19: 4:20 error: unexpected token: `_`
<anon>:4 println!("{}",_);
The underscore is used to silence unused variable warnings (for example the compiler will yell at you if you define some_var and never use it, but won't if you define _some_var and never use it). It is also used as a fallback in a match statement to match anything that did not match other paths:
fn main() {
let v = vec![box 1i];
let f = &v[0];
match **f {
3i => println!("{}",f),
_ => println!("nothing here")
};
}
Someone smarter than me should comment on if the underscore is a valid variable name. Honestly I think the compiler shouldn't allow it.
I'm writing a project in which a struct System can be constructed from a data file.
In the data file, some lines contain keywords that indicate values to be read either inside the line or in the N following lines (separated from the keyword line by a blank line).
I would like to have a vec! containing the keywords (statically known at compile time), check if the line returned by the iterator contains the keyword and do the appropriate operations.
Now my code looks like this:
impl System {
fn read_data<P>(filename: P) -> io::Result<io::Lines<io::BufReader<File>>> where P: AsRef<Path> {
let file = File::open(filename)?;
let f = BufReader::new(file);
Ok(f.lines())
}
...
pub fn new_from_data<P>(dataname: P) -> System where P: AsRef<Path> {
let keywd = vec!["atoms", "atom types".into(),
"Atoms".into()];
let mut sys = System::new();
if let Ok(mut lines) = System::read_data(dataname) {
while let Some(line) = lines.next() {
for k in keywd {
let split: Vec<&str> = line.unwrap().split(" ").collect();
if split.contains(k) {
match k {
"atoms" => sys.natoms = split[0].parse().unwrap(),
"atom types" => sys.ntypes = split[0].parse().unwrap(),
"Atoms" => {
lines.next();
// assumes fields are: atom-ID molecule-ID atom-type q x y z
for _ in 1..=sys.natoms {
let atline = lines.next().unwrap().unwrap();
let data: Vec<&str> = atline.split(" ").collect();
let atid: i32 = data[0].parse().unwrap();
let molid: i32 = data[1].parse().unwrap();
let atype: i32 = data[2].parse().unwrap();
let charge: f32 = data[3].parse().unwrap();
let x: f32 = data[4].parse().unwrap();
let y: f32 = data[5].parse().unwrap();
let z: f32 = data[6].parse().unwrap();
let at = Atom::new(atid, molid, atype, charge, x, y, z);
sys.atoms.push(at);
};
},
_ => (),
}
}
}
}
}
sys
}
}
I'm very unsure on two points:
I don't know if I handled the line-by-line reading of the file in an idiomatic way, as I adapted some examples from the book and Rust by Example. But returning an iterator makes me wonder when and how to unwrap the results. For example, when calling the iterator inside the while loop, do I have to unwrap twice, as in let atline = lines.next().unwrap().unwrap();? I think the compiler does not complain about this yet because of the first error it encounters, described in the next point.
I cannot wrap my head around the type to give to the value k, as I get a typical:
error[E0308]: mismatched types
--> src/system/system.rs:65:39
|
65 | if split.contains(k) {
| ^ expected `&str`, found `str`
|
= note: expected reference `&&str`
found reference `&str`
error: aborting due to previous error
How are we supposed to declare the substring and compare it to the strings I put in keywd? I tried to dereference k in contains, to make it look at &keywd, etc., but I feel I'm wasting my time by not properly addressing the problem. Thanks in advance, any help is appreciated.
Let's go through the issues one by one, in the order they appear in the code.
First, you need to borrow keywd in the for loop, i.e. &keywd. Otherwise keywd is moved into the for loop during the first iteration of the while loop and is no longer available for the next one, which is what the compiler complains about.
for k in &keywd {
let split: Vec<&str> = line.unwrap().split(" ").collect();
Next, when you call .unwrap() on line, that's the same problem. That causes the inner Ok value to get moved out of the Result. Instead you can do line.as_ref().unwrap() as then you get a reference to the inner Ok value and aren't consuming the line Result.
Alternatively, you can .filter_map(Result::ok) on your lines, to avoid (.as_ref()).unwrap() altogether.
You can add that directly to read_data and even simplify the return type using impl ....
fn read_data<P>(filename: P) -> io::Result<impl Iterator<Item = String>>
where
P: AsRef<Path>,
{
let file = File::open(filename)?;
let f = BufReader::new(file);
Ok(f.lines().filter_map(Result::ok))
}
Note that you're splitting line for every keywd, which is needless. So you can move that outside of your for loop as well.
All in all, it ends up looking like this:
if let Ok(mut lines) = read_data("test.txt") {
while let Some(line) = lines.next() {
let split: Vec<&str> = line.split(" ").collect();
for k in &keywd {
if split.contains(k) {
...
Since we now iterate over &keywd, we don't need to change k to &k: k is already a &&str.
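Putting the pieces together, here is a minimal, self-contained sketch of the same approach. Several things in it are assumptions for illustration only: Counts is a hypothetical, cut-down stand-in for System, the reader is generic so the sketch can run on in-memory data instead of a file, and the keyword "types" replaces "atom types" (a keyword containing a space can never be equal to a single whitespace-separated token). Note the match *k: the arms are &str literals, so k is dereferenced once.

```rust
use std::io::BufRead;

// Hypothetical, simplified stand-in for the question's `System` struct.
#[derive(Debug, Default)]
struct Counts {
    natoms: usize,
    ntypes: usize,
}

// Generic over any buffered reader; lines that fail to read are skipped.
fn read_lines<R: BufRead>(reader: R) -> impl Iterator<Item = String> {
    reader.lines().filter_map(Result::ok)
}

fn parse_counts<R: BufRead>(reader: R) -> Counts {
    let keywd = vec!["atoms", "types"];
    let mut counts = Counts::default();
    for line in read_lines(reader) {
        // Split each line once, outside the keyword loop.
        let split: Vec<&str> = line.split_whitespace().collect();
        for k in &keywd {
            // `k` is a `&&str`, which is what `contains` expects on a `Vec<&str>`.
            if split.contains(k) {
                // Dereference once so the `&str` literals in the arms apply.
                match *k {
                    "atoms" => counts.natoms = split[0].parse().unwrap(),
                    "types" => counts.ntypes = split[0].parse().unwrap(),
                    _ => (),
                }
            }
        }
    }
    counts
}

fn main() {
    let data = "4 atoms\n2 types\n";
    let counts = parse_counts(data.as_bytes());
    println!("{:?}", counts);
}
```

For a real file, parse_counts(BufReader::new(File::open(path)?)) would be the corresponding call.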
I'm wondering if someone can help me understand why this program behaves as it does:
fn main() {
let mut x = 456;
{
let mut y = Box::new(&x);
y = Box::new(&mut y);
println!("GOT {}",*y);
}
}
This program compiles under rust 1.35.0 (both 2015 and 2018 editions), and prints
GOT 456
But, I'm confused what's going on here. I'm guessing that this is an example of an auto-dereference. So, in reality, it looks like this:
fn main() {
let mut x = 456;
{
let mut y = Box::new(&x);
y = Box::new(&mut *y);
println!("GOT {}",*y);
}
}
Is that it?
This is a case of deref coercion, but one that is obfuscated by a few other unnecessary parts of the code. The following improvements should be made here:
The mut modifier on variable x is not needed because it is never modified.
The borrow of y in Box::new(&mut y) does not have to be mutable because the variable holds an immutable reference.
The println! implementation also knows to print values behind references, so the explicit * is not needed.
Then, we get the following code:
fn main() {
let x = 456;
{
let mut y = Box::new(&x);
y = Box::new(&y);
println!("GOT {}", y);
}
}
y is a variable of type Box<&i32>, initially bound to a box holding a reference to x from the outer scope. The subsequent assignment of a new box works because &y, of type &Box<&i32>, is coerced to &&i32 and then, by automatically dereferencing the first borrow, to the &i32 that can be put in the box. This coercion is required because the variable y can only be assigned values of the same Box<&i32> type.
The lifetime of the reference inside both boxes also ended up being the same, because they refer to the same value in x.
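The coercion can be seen in isolation with a smaller sketch. The function read is a hypothetical helper added only for illustration; the point is that a &Box<&i32> is accepted wherever a &i32 is expected.

```rust
// Hypothetical helper: accepts a plain `&i32`.
fn read(r: &i32) -> i32 {
    *r
}

fn main() {
    let x = 456;
    let boxed: Box<&i32> = Box::new(&x);

    // `&boxed` has type `&Box<&i32>`. Because a `&i32` is expected here,
    // the compiler dereferences through the box and the inner reference.
    let r: &i32 = &boxed;
    assert_eq!(*r, 456);

    // The same coercion happens at a function call.
    assert_eq!(read(&boxed), 456);
}
```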
See also:
What is the relation between auto-dereferencing and deref coercion?
I'm trying to understand how & and ref correspond. Here's an example where I thought these were equivalent, but one works and the other doesn't:
fn main() {
let t = "
aoeu
aoeu
aoeu
a";
let ls = t.lines();
dbg!(ls.clone().map(|l| &l[..]).collect::<Vec<&str>>().join("\n")); // works
dbg!(ls.clone().map(|ref l| l[..]).collect::<Vec<&str>>().join("\n")); // doesn't work
dbg!(ls.clone().map(|ref l| &l[..]).collect::<Vec<&str>>().join("\n")); // works again!
}
From the docs:
// A `ref` borrow on the left side of an assignment is equivalent to
// an `&` borrow on the right side.
let ref ref_c1 = c;
let ref_c2 = &c;
println!("ref_c1 equals ref_c2: {}", *ref_c1 == *ref_c2);
What would the equivalent to |l| &l[..] be with |ref l|? How does it correspond to the assignment examples in the docs?
Taking a look at the docs page for Lines (the iterator adapter for producing lines from a str), we can see that the item produced by it is:
type Item = &'a str;
Therefore the following happens when trying to do the "doesn't work" version:
dbg!(ls.clone().map(|ref l| l[..]).collect::<Vec<&str>>().join("\n")); // doesn't work
//Can become:
let temp = ls
.clone()
.map(|ref l| l[..])
.collect::<Vec<&str>>()
.join("\n");
println!("{}", temp);
Here we can see a crucial problem: l is of type &&str (which I will explain below), so indexing into it produces a str, which is unsized and therefore cannot exist outside of a pointer of some sort.
Now, onto the real thing you wanted to learn: What a ref pattern does:
When doing pattern matching or destructuring via the let binding, the ref keyword can be used to take references to the fields of a struct/tuple.
What this does is the following:
When we have let ref x = y, we take a reference to y.
When pattern matching on something (as in the closure arguments you showed), the effect is slightly different: the argument value is moved into the function or closure, and the name is bound to a reference to that value rather than to the value itself. For example:
fn foo(ref x: String) {}
let y: fn(String) = foo;
This works because what is essentially being done is this:
fn foo(x: String) {
let x: &String = &x;
}
So what ref x does is take ownership of x and produce a reference to it.
On the other hand
When we have let &x = y we move a value out of y.
This is the opposite of ref, in that we take ownership of the value under y if we can. For example:
let x = 2;
let y = &x;
let &z = y; //Ok, we're moving a `Copy` type
This is only OK for Copy types, though. It is not exactly the same as let z = *y, which would also work for moving a value out of an owned Box.
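To tie this back to the closures in the question, here is a short sketch of the three cases side by side. The function len_of and the sample strings are made up for illustration. With |ref l|, l is a &&str, so the slice expression still needs its leading & to produce a &str.

```rust
// `ref s` in a parameter: the String is moved in, and `s` is a `&String`.
fn len_of(ref s: String) -> usize {
    s.len()
}

fn main() {
    let c = 'Q';
    let ref r1 = c; // r1: &char
    let r2 = &c; // r2: &char
    assert_eq!(*r1, *r2);

    let n = 2;
    let &z = &n; // z: i32, copied out from behind the reference
    assert_eq!(z, 2);

    assert_eq!(len_of(String::from("abc")), 3);

    // In a closure, `ref l` makes `l` a `&&str`; `&l[..]` is again a `&str`.
    let t = "a\nb";
    let with_ref: Vec<&str> = t.lines().map(|ref l| &l[..]).collect();
    let without: Vec<&str> = t.lines().map(|l| &l[..]).collect();
    assert_eq!(with_ref, without);
}
```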
How can I know the type of a binding if I use automatic type deduction when creating it? What if the expression on the right side is a borrow (like let x = &5;): will it be a value or a borrow? What will happen if I re-assign a borrow or a value?
Just to check: I can re-assign a borrow if I use let mut x: &mut T = &mut T{}; or let mut x:&T = & T{};, right?
I sense some confusion between binding and assigning:
Binding introduces a new variable, and associates it to a value,
Assigning overwrites a value with another.
This can be illustrated in two simple lines:
let mut x = 5; // Binding
x = 10; // Assigning
A binding may appear in multiple places in Rust:
let statements,
if let/while let conditions,
cases in a match expression,
and even in a for expression, on the left side of in.
Whenever there is a binding, Rust's grammar also allows pattern matching:
in the case of let statements and for expressions, the patterns must be irrefutable,
in the case of if let, while let and match cases, the patterns may fail to match.
Pattern matching means that the type of the variable introduced by the binding differs based on how the binding is made:
let x = &5; // x: &i32
let &y = &5; // y: i32
Assigning always requires using =, the assignment operator.
When assigning, the former value is overwritten, and drop is called on it if it implements Drop.
let mut x = 5;
x = 6;
// Now x == 6, drop was not called because it's a i32.
let mut s = String::from("Hello, World!");
s = String::from("Hello, 神秘德里克!");
// Now s == "Hello, 神秘德里克!", drop was called because it's a String.
The value that is overwritten may be as simple as an integer or float, a more involved struct or enum, or a reference.
let mut r = &5;
r = &6;
// Now r points to 6, drop was not called as it's a reference.
Overwriting a reference does not overwrite the value pointed to by the reference, but the reference itself. The original value still lives on, and will be dropped when it's ready.
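A quick way to convince yourself of this, as a sketch with two illustrative variables a and b: after re-pointing r, the value it used to point to is untouched.

```rust
fn main() {
    let a = 5;
    let b = 6;

    let mut r = &a;
    assert_eq!(*r, 5);

    // Only the reference is overwritten; `a` itself is not modified.
    r = &b;
    assert_eq!(*r, 6);
    assert_eq!(a, 5);
}
```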
To overwrite the pointed to value, one needs to use *, the dereference operator:
let mut x = 5;
let r = &mut x;
*r = 6;
// r still points to x, and now x = 6.
If the type of the dereferenced value requires it, drop will be called:
let mut s = String::from("Hello, World!");
let r = &mut s;
*r = String::from("Hello, 神秘德里克!");
// r still points to s, and now s = "Hello, 神秘德里克!".
I invite you to use the playground and toy around; you can start from here:
fn main() {
let mut s = String::from("Hello, World!");
{
let r = &mut s;
*r = String::from("Hello, 神秘德里克!");
}
println!("{}", s);
}
Hopefully, things should be a little clearer now, so let's check your samples.
let x = &5;
x is a reference to i32 (&i32). What happens is that the compiler will introduce a temporary in which 5 is stored, and then borrow this temporary.
let mut x: &mut T = T{};
Is impossible. The type of T{} is T not &mut T, so this fails to compile. You could change it to let mut x: &mut T = &mut T{};.
And your last example is similar.
I am trying to measure the speed of Vec's [] indexing vs. .get(index) using the following code:
extern crate time;
fn main() {
let v = vec![1; 1_000_000];
let before_rec1 = time::precise_time_ns();
for (i, v) in (0..v.len()).enumerate() {
v[i]
}
let after_rec1 = time::precise_time_ns();
println!("Total time: {}", after_rec1 - before_rec1);
let before_rec2 = time::precise_time_ns();
for (i, v) in (0..v.len()).enumerate() {
v.get(i)
}
let after_rec2 = time::precise_time_ns();
println!("Total time: {}", after_rec2 - before_rec2);
}
but this returns the following errors:
error: cannot index a value of type `usize`
--> src/main.rs:8:9
|
8 | v[i]
| ^^^^
error: no method named `get` found for type `usize` in the current scope
--> src/main.rs:17:11
|
17 | v.get(i)
| ^^^
I'm confused why this doesn't work, since enumerate should give me an index which, by its very name, I should be able to use to index the vector.
Why is this error being thrown?
I know I can/should use iteration rather than C-style way of indexing, but for learning's sake what do I use to iterate over the index values like I'm trying to do here?
You, pal, are mightily confused here.
fn main() {
let v = vec![1; 1_000_000];
This v has type Vec<i32>.
for (i, v) in (0..v.len()).enumerate() {
v[i]
}
You are iterating over a range of indexes, from 0 to v.len(), and using enumerate to generate indices as you go:
This v has type usize
In the loop, v == i, always
So... indeed, the compiler is correct, you cannot use [] on usize.
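To make the point concrete, here is a small sketch (the names idx and elem are only illustrative). Enumerating a range pairs each number with itself, whereas enumerating the vector's iterator pairs each index with the element stored there, which is probably what was intended.

```rust
fn main() {
    let v = vec![10, 20, 30];

    // Enumerating a range of indices: both halves of the pair are the same number.
    for (i, idx) in (0..v.len()).enumerate() {
        assert_eq!(i, idx);
    }

    // Enumerating the vector itself: `i` is the index, `elem` is a `&i32`.
    for (i, elem) in v.iter().enumerate() {
        assert_eq!(v[i], *elem);
    }
}
```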
The program "fixed":
extern crate time;
fn main() {
let v = vec![1; 1_000_000];
let before_rec1 = time::precise_time_ns();
for i in 0..v.len() {
v[i];
}
let after_rec1 = time::precise_time_ns();
println!("Total time: {}", after_rec1 - before_rec1);
let before_rec2 = time::precise_time_ns();
for i in 0..v.len() {
v.get(i);
}
let after_rec2 = time::precise_time_ns();
println!("Total time: {}", after_rec2 - before_rec2);
}
I would add a disclaimer, though: if I were a compiler, I would optimize this useless loop into a no-op. If, after compiling with --release, your program reports 0, this is what happened.
Rust has built-in benchmarking support, and I advise that you use it rather than going the naive way. You will also need to inspect the emitted assembly, which is the only way to make sure that you are measuring what you think you are (optimizing compilers are tricky like that).
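If you still want a quick hand-rolled measurement, the following is a rough sketch using only the standard library instead of the time crate. It assumes a recent stable toolchain where std::hint::black_box is available; the functions sum_index and sum_get are made up for this example. Accumulating a sum and passing it through black_box gives the optimizer a reason to keep the loops, but it is a best-effort hint, not a guarantee, so the earlier advice about checking the assembly still applies.

```rust
use std::hint::black_box;
use std::time::Instant;

// Sums the elements using `[]` indexing (bounds-checked, panics on failure).
fn sum_index(v: &[i32]) -> i64 {
    let mut total = 0i64;
    for i in 0..v.len() {
        total += i64::from(v[i]);
    }
    total
}

// Sums the elements using `.get(i)` (bounds-checked, returns an Option).
fn sum_get(v: &[i32]) -> i64 {
    let mut total = 0i64;
    for i in 0..v.len() {
        if let Some(x) = v.get(i) {
            total += i64::from(*x);
        }
    }
    total
}

fn main() {
    let v = vec![1; 1_000_000];

    let start = Instant::now();
    let a = black_box(sum_index(black_box(&v)));
    let t_index = start.elapsed();

    let start = Instant::now();
    let b = black_box(sum_get(black_box(&v)));
    let t_get = start.elapsed();

    assert_eq!(a, b);
    println!("index: {:?}, get: {:?}", t_index, t_get);
}
```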