What is the difference between (&v).func() and &v.func()? - rust

What is (&v) actually doing in this code?
let v = vec!["hello", "viktor"];
let mut iterator = (&v).into_iter(); // Iter<&str>
let mut iterator = &v.into_iter(); // &IntoIter<&str>
How does it change what .into_iter() returns? Why is the result different?

This is a precedence issue. A method call binds more tightly than the unary & operator, so the compiler reads &v.into_iter() as &(v.into_iter()), not (&v).into_iter(), just as it reads 1+2*3 as 1+(2*3) and not (1+2)*3.
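To see the two parses side by side, here is a minimal sketch that pins each expression to an explicit type, so the compiler itself checks the claim. The function name compare_precedence is made up for illustration and is not part of the question.

```rust
use std::slice::Iter;
use std::vec::IntoIter;

// Hypothetical helper: returns how many items each form can see.
fn compare_precedence() -> (usize, usize) {
    let v = vec!["hello", "viktor"];

    // Parenthesized: borrow `v` first, then call `into_iter` on `&Vec<&str>`.
    // This yields a borrowing iterator and leaves `v` usable afterwards.
    let borrowed: Iter<'_, &str> = (&v).into_iter();
    let seen_by_borrowed = borrowed.count();

    // Unparenthesized: parsed as `&(v.into_iter())`. `v` is moved into an
    // owning iterator, and we only hold a shared reference to that iterator.
    let by_ref: &IntoIter<&str> = &v.into_iter();
    // `by_ref.next()` would not compile here, because `next` needs `&mut self`.
    let seen_by_ref = by_ref.len();

    (seen_by_borrowed, seen_by_ref)
}
```

Both forms see the same two elements, but only the parenthesized one gives you an iterator you can actually advance without consuming v.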

Related

The size for values of type `[char]` cannot be known at compilation time

I have the following Rust code ...
const BINARY_SIZE: usize = 5;
let mut all_bits: Vec<[char; BINARY_SIZE]> = Vec::new();
let mut one_bits: [char; BINARY_SIZE] = ['0'; BINARY_SIZE];
all_bits.push(one_bits);
for i in [0..BINARY_SIZE] {
let one = all_bits[0];
let first_ok = one[0]; // This works, first_ok is '0'
let first_fail = one[i]; // This does not compile
}
How can I get from the variable 'one' the i'th character from the array?
For let first_fail = one[i]; the compiler gives me this error message:
error[E0277]: the size for values of type [char] cannot be known at compilation time
Your problem is that you're using the range syntax incorrectly. By wrapping 0..BINARY_SIZE in brackets, you're iterating over the elements of an array of ranges (here a single element), rather than iterating over the values within the range you specified.
This means that i is of type Range<usize> rather than usize. You can prove this by adding let i: usize = i; at the top of the loop, which fails with a mismatched-types error. And indexing with a range returns a slice, rather than an element of your array.
Try removing the brackets like so:
const BINARY_SIZE: usize = 5;
let mut all_bits: Vec<[char; BINARY_SIZE]> = Vec::new();
let mut one_bits: [char; BINARY_SIZE] = ['0'; BINARY_SIZE];
all_bits.push(one_bits);
for i in 0..BINARY_SIZE {
let one = all_bits[0];
let first_ok = one[0]; // This works, first_ok is '0'
let first_fail = one[i]; // This works now
}
The error here really doesn't help much. But if you were using a helpful editor integration like rust-analyzer, you would see an inlay type hint showing i: Range.
Perhaps the Rust compiler error message here could be improved to trace back through the index type.
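As a rough illustration of the difference, the sketch below spells out the types with explicit annotations. The two function names are invented for this example; only BINARY_SIZE and the array come from the question.

```rust
use std::ops::Range;

const BINARY_SIZE: usize = 5;

// Hypothetical helper: loops over `[0..BINARY_SIZE]` and records slice lengths.
fn bracketed_lengths() -> Vec<usize> {
    let one: [char; BINARY_SIZE] = ['0'; BINARY_SIZE];
    let mut lengths = Vec::new();
    // The brackets build a one-element array whose only element is the range.
    for r in [0..BINARY_SIZE] {
        let r: Range<usize> = r;
        // Indexing with a range yields the unsized type `[char]`,
        // so the result has to stay behind a reference.
        let slice: &[char] = &one[r];
        lengths.push(slice.len());
    }
    lengths
}

// Hypothetical helper: loops over the range itself, so `i` is a `usize`.
fn unbracketed_chars() -> Vec<char> {
    let one: [char; BINARY_SIZE] = ['0'; BINARY_SIZE];
    let mut chars = Vec::new();
    for i in 0..BINARY_SIZE {
        let c: char = one[i];
        chars.push(c);
    }
    chars
}
```

The bracketed loop runs exactly once and sees a five-element slice, while the plain range loop runs five times and sees one char each time.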

Build a Hashset from a lines iterator

I don't understand why this doesn't work:
use std::collections::HashSet;
let test = "foo\nbar\n";
let hashset: HashSet<_> = test
.lines()
.collect::<Result<HashSet<_>, _>>()
.unwrap()
I get this error:
a value of type Result<HashSet<_>, _> cannot be built from an iterator over elements of type &str
I tried to use an intermediary Vec but I didn't succeed either. I understand the error, but I don't know how to fix it elegantly.
This works but isn't the fastest solution:
use std::collections::HashSet;
let test = "foo\nbar\n";
let mut hashset = HashSet::new();
for word in test.lines() {
hashset.insert(word.to_string());
}
The lines() method cannot fail, as it operates over a &str, therefore you should collect to a HashSet<&str>.
See https://doc.rust-lang.org/std/primitive.str.html#method.lines.
For example:
let test = "foo\nbar\n";
let hashset: HashSet<&str> = test
.lines()
.collect();
Your confusion here seems to come from the fact that there's a similar lines method that operates on BufRead which can fail due to operating on files, or other I/O based sources.
See https://doc.rust-lang.org/std/io/trait.BufRead.html#method.lines.
Apart from this difference, BufRead::lines also differs in that it yields owned Strings (each wrapped in an io::Result) instead of borrowed &str.
If you want to create a HashSet that owns its contents, you can modify your code like this:
let test = "foo\nbar\n";
let hashset: HashSet<String> = test
.lines()
.map(String::from)
.collect();
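For contrast, here is a small sketch of the case where collecting into a Result does make sense: BufRead::lines yields io::Result<String>, so the whole iterator can be collected into io::Result<HashSet<String>>. The function name unique_lines is an assumption for this example, and an in-memory Cursor stands in for a file.

```rust
use std::collections::HashSet;
use std::io::{self, BufRead, Cursor};

// Hypothetical helper: collects the distinct lines of any buffered reader.
// The first I/O error, if any, short-circuits the collection.
fn unique_lines<R: BufRead>(reader: R) -> io::Result<HashSet<String>> {
    reader.lines().collect()
}

// Hypothetical helper: runs `unique_lines` over an in-memory buffer.
fn unique_lines_in(text: &str) -> io::Result<HashSet<String>> {
    unique_lines(Cursor::new(text))
}
```

With str::lines there is no error to propagate, which is why the plain collect() into HashSet<&str> is the right shape there.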

Is there a zero-copy way to find the intersection of an arbitrary number of sets?

Here is a simple example demonstrating what I'm trying to do:
use std::collections::HashSet;
fn main() {
let mut sets: Vec<HashSet<char>> = vec![];
let mut set = HashSet::new();
set.insert('a');
set.insert('b');
set.insert('c');
set.insert('d');
sets.push(set);
let mut set = HashSet::new();
set.insert('a');
set.insert('b');
set.insert('d');
set.insert('e');
sets.push(set);
let mut set = HashSet::new();
set.insert('a');
set.insert('b');
set.insert('f');
set.insert('g');
sets.push(set);
// Simple intersection of two sets
let simple_intersection = sets[0].intersection(&sets[1]);
println!("Intersection of 0 and 1: {:?}", simple_intersection);
let mut iter = sets.iter();
let base = iter.next().unwrap().clone();
let intersection = iter.fold(base, |acc, set| acc.intersection(set).map(|x| x.clone()).collect());
println!("Intersection of all: {:?}", intersection);
}
This solution uses fold to "accumulate" the intersection, using the first element as the initial value.
Intersections are lazy iterators which iterate through references to the involved sets. Since the accumulator has to have the same type as the first element, we have to clone each set's elements. We can't make a set of owned data from references without cloning. I think I understand this.
For example, this doesn't work:
let mut iter = sets.iter();
let mut base = iter.next().unwrap();
let intersection = iter.fold(base, |acc, set| acc.intersection(set).collect());
println!("Intersection of all: {:?}", intersection);
error[E0277]: a value of type `&HashSet<char>` cannot be built from an iterator over elements of type `&char`
--> src/main.rs:41:73
|
41 | let intersection = iter.fold(base, |acc, set| acc.intersection(set).collect());
| ^^^^^^^ value of type `&HashSet<char>` cannot be built from `std::iter::Iterator<Item=&char>`
|
= help: the trait `FromIterator<&char>` is not implemented for `&HashSet<char>`
Even understanding this, I still don't want to clone the data. In theory it shouldn't be necessary, I have the data in the original vector, I should be able to work with references. That would speed up my algorithm a lot. This is a purely academic pursuit, so I am interested in getting it to be as fast as possible.
To do this, I would need to accumulate in a HashSet<&char>, but I can't do that because I can't intersect a HashSet<&char> with a HashSet<char> in the closure. So it seems like I'm stuck. Is there any way to do this?
Alternatively, I could make a set of references for each set in the vector, but that doesn't really seem much better. Would it even work? I might run into the same problem but with double references instead.
Finally, I don't actually need to retain the original data, so I'd be okay moving the elements into the accumulator set. I can't figure out how to make this happen, since I have to go through intersection which gives me references.
Are any of the above proposals possible? Is there some other zero copy solution that I'm not seeing?
"Finally, I don't actually need to retain the original data."
This makes it really easy.
First, optionally sort the sets by size. Then:
let (intersection, others) = sets.split_at_mut(1);
let intersection = &mut intersection[0];
for other in others {
intersection.retain(|e| other.contains(e));
}
You can do it in a fully lazy way using filter and all:
sets[0].iter().filter(move |c| sets[1..].iter().all(|s| s.contains(*c)))
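A possible way to package the lazy version is a function that borrows the sets and returns an iterator of references, so nothing is cloned. This is only a sketch: the name lazy_intersection and the choice to return an empty iterator for an empty input are assumptions, not part of the answers above.

```rust
use std::collections::HashSet;

// Hypothetical helper: lazily yields references to the elements of the first
// set that are also contained in every other set. No element is cloned.
fn lazy_intersection<'a>(
    sets: &'a [HashSet<char>],
) -> impl Iterator<Item = &'a char> + 'a {
    // `split_first` returns None for an empty slice, which becomes an empty iterator.
    sets.split_first().into_iter().flat_map(|(first, rest)| {
        // `filter` hands the closure a `&&char`, so dereference once for `contains`.
        first.iter().filter(move |c| rest.iter().all(|s| s.contains(*c)))
    })
}

// Hypothetical helper: builds the three sets from the question.
fn example_sets() -> Vec<HashSet<char>> {
    ["abcd", "abde", "abfg"]
        .iter()
        .map(|s| s.chars().collect())
        .collect()
}
```

Each lookup is a hash probe, so the cost is roughly the size of the first set times the number of other sets; putting the smallest set first keeps that down.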
"Finally, I don't actually need to retain the original data, so I'd be okay moving the elements into the accumulator set."
The retain method will work perfectly for your requirements then:
fn intersection(mut sets: Vec<HashSet<char>>) -> HashSet<char> {
if sets.is_empty() {
return HashSet::new();
}
if sets.len() == 1 {
return sets.pop().unwrap();
}
let mut result = sets.pop().unwrap();
result.retain(|item| {
sets.iter().all(|set| set.contains(item))
});
result
}

What's the semantic of assignment in Rust?

How can I know the type of a binding if I rely on type inference when creating it? What if the expression on the right side is a borrow (like let x = &5;): will the binding be a value or a borrow? What happens if I re-assign a borrow or a value?
Just to check: I can re-assign a borrow if I use let mut x: &mut T = &mut T{}; or let mut x: &T = &T{};, right?
I sense some confusion between binding and assigning:
Binding introduces a new variable and associates it with a value,
Assigning overwrites an existing value with another.
This can be illustrated in two simple lines:
let mut x = 5; // Binding
x = 10; // Assigning
A binding may appear in multiple places in Rust:
let statements,
if let/while let conditions,
cases in a match expression,
and even in a for expression, on the left side of in.
Whenever there is a binding, Rust's grammar also allows pattern matching:
in the case of let statements and for expressions, the patterns must be irrefutable,
in the case of if let, while let and match cases, the patterns may fail to match.
Pattern matching means that the type of the variable introduced by the binding differs based on how the binding is made:
let x = &5; // x: &i32
let &y = &5; // y: i32
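The binding sites listed above can be sketched in a few small functions. The function names are invented for illustration; the patterns themselves are ordinary Rust.

```rust
// Irrefutable pattern in a `let`: `&y` strips the reference, so `y` is an `i32`.
fn deref_in_pattern(r: &i32) -> i32 {
    let &y = r;
    y
}

// Refutable pattern: `if let` only binds `n` when the value is `Some`.
fn describe(maybe: Option<i32>) -> String {
    if let Some(n) = maybe {
        format!("got {n}")
    } else {
        String::from("nothing")
    }
}

// A `for` loop also binds with an irrefutable pattern on the left of `in`.
fn sum_of_products(pairs: &[(i32, i32)]) -> i32 {
    let mut total = 0;
    for &(a, b) in pairs {
        total += a * b;
    }
    total
}
```

In each case the pattern, not just the right-hand side, decides the types of the variables that come into scope.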
Assigning always requires using =, the assignment operator.
When assigning, the former value is overwritten, and drop is called on it if it implements Drop.
let mut x = 5;
x = 6;
// Now x == 6, drop was not called because it's an i32.
let mut s = String::from("Hello, World!");
s = String::from("Hello, 神秘德里克!");
// Now s == "Hello, 神秘德里克!", drop was called because it's a String.
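To make the drop on assignment visible, one option is a type whose Drop implementation records when it runs. The Noisy type and the shared log below are made up for this sketch.

```rust
use std::cell::RefCell;

// Hypothetical type: appends its name to a shared log when it is dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// Hypothetical helper: returns the order in which events happened.
fn assignment_drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let mut n = Noisy { name: "first", log: &log };
        println!("holding {}", n.name);
        // Assigning overwrites the old value, which is dropped right here.
        n = Noisy { name: "second", log: &log };
        log.borrow_mut().push("after assignment");
        println!("holding {}", n.name);
    } // `n` goes out of scope, so the second value is dropped here.
    log.into_inner()
}
```

The log comes back as "first", "after assignment", "second": the old value is dropped at the assignment, and the new one at the end of the scope.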
The value that is overwritten may be as simple as an integer or float, a more involved struct or enum, or a reference.
let mut r = &5;
r = &6;
// Now r points to 6, drop was not called as it's a reference.
Overwriting a reference does not overwrite the value pointed to by the reference, but the reference itself. The original value still lives on, and will be dropped when it's ready.
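A quick way to convince yourself of this is to point a reference at one variable, re-assign it to another, and check that both variables are unchanged. The function name below is hypothetical.

```rust
// Hypothetical helper: returns (value seen before, value seen after, original `a`).
fn retarget_shared_reference() -> (i32, i32, i32) {
    let a = 5;
    let b = 6;

    let mut r = &a;
    let before = *r;

    // Only the reference is overwritten; `a` itself is untouched.
    r = &b;
    let after = *r;

    (before, after, a)
}
```

The result is (5, 6, 5): r now points to b, and a still holds 5.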
To overwrite the pointed-to value, one needs to use *, the dereference operator:
let mut x = 5;
let r = &mut x;
*r = 6;
// r still points to x, and now x = 6.
If the type of the dereferenced value requires it, drop will be called:
let mut s = String::from("Hello, World!");
let r = &mut s;
*r = String::from("Hello, 神秘德里克!");
// r still points to s, and now s = "Hello, 神秘德里克!".
I invite you to use the playground and toy around; you can start from here:
fn main() {
let mut s = String::from("Hello, World!");
{
let r = &mut s;
*r = String::from("Hello, 神秘德里克!");
}
println!("{}", s);
}
Hopefully, things should be a little clearer now, so let's check your samples.
let x = &5;
x is a reference to i32 (&i32). What happens is that the compiler will introduce a temporary in which 5 is stored, and then borrow this temporary.
let mut x: &mut T = T{};
This is impossible. The type of T{} is T, not &mut T, so this fails to compile. You could change it to let mut x: &mut T = &mut T{};.
And your last example is similar.
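As a sketch of the corrected form, the example below declares a mutable binding holding a &mut T and then re-assigns it to point somewhere else. The struct T with a single field n is a stand-in, since the question never defines T.

```rust
// Hypothetical stand-in for the question's `T`.
struct T {
    n: i32,
}

// Hypothetical helper: returns the values seen through `x` before and after re-assignment.
fn retarget_mutable_reference() -> (i32, i32) {
    let mut other = T { n: 2 };

    // `x` borrows a temporary `T`, which is kept alive for the enclosing block.
    let mut x: &mut T = &mut T { n: 1 };
    x.n += 10;
    let first = x.n;

    // `mut x` lets us re-assign the reference itself to a different `T`.
    x = &mut other;
    x.n += 20;

    (first, other.n)
}
```

This returns (11, 22). The mut on the binding controls whether x can be re-pointed, while the &mut in the type controls whether the pointed-to T can be modified.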

What is the difference between the two slices in Rust?

let s1 = String::from("hello world.");
let r1 = &s1;
let sl1 = &s1[..];
let sl2 = &r1[..];
let sl3 = r1[..];
println!("{}", sl3);
What is the difference between sl1 and sl2, and why is sl3 invalid? r1 is already a reference, so why is the & needed?
The compiler dereferences the output of Index::index when desugaring the indexing syntax [] (see related question and its answers). Using explicit type annotations, the types of the bindings are thus as follows:
let r1: &String = &s1;
let sl1: &str = &s1[..];
let sl2: &str = &r1[..];
let sl3: str = r1[..];
str, being an unsized type, cannot be put on the stack and therefore cannot be used as the type for a local variable binding sl3, hence the compile error.
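To check that sl1 and sl2 really are the same thing, a small sketch can compare the two slices by pointer and length. The function name is made up; the bindings mirror the question.

```rust
// Hypothetical helper: returns whether both slices cover the same bytes,
// together with the text they contain.
fn compare_slices() -> (bool, String) {
    let s1 = String::from("hello world.");
    let r1: &String = &s1;

    // Indexing auto-dereferences `r1`, so both expressions slice the same String.
    let sl1: &str = &s1[..];
    let sl2: &str = &r1[..];

    // `let sl3 = r1[..];` would not compile: the value has the unsized type `str`.
    // Keeping it behind a pointer type such as `&str` or `Box<str>` is fine.
    let boxed: Box<str> = Box::from(&r1[..]);

    (std::ptr::eq(sl1, sl2), boxed.into_string())
}
```

The comparison is true, so the only difference between sl1 and sl2 is how you got there; the extra & in &r1[..] is what turns the unsized str place back into a usable &str.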
