How do I extend a HashSet<String> with another HashSet<String>?

When I try to extend a HashSet<String> with another HashSet<String>:
use std::collections::HashSet;
let mut a = HashSet::new();
a.insert("foo".to_owned());
let mut b = HashSet::new();
b.insert("bar".to_owned());
let c = a.extend(&b);
I get:
error[E0271]: type mismatch resolving `<&HashSet<String> as IntoIterator>::Item == String`
--> src/main.rs:7:11
|
7 | let c = a.extend(&b);
| ^^^^^^ expected reference, found struct `String`
|
= note: expected reference `&String`
found struct `String`
How do I do this?

HashSet::extend() comes from the Extend trait and therefore accepts any iterator (or iterable), not just a set.
If you pass it a reference to a set, that reference is iterable, but its iterator produces references to the elements of the set, in this case &String. Since HashSet::<String>::extend() expects an iterator that produces owned String values, that doesn't compile. There are two ways to fix the issue:
by passing the set b to extend by value: a.extend(b);
by creating an explicit iterator out of b using b.iter() and cloning the values it produces: a.extend(b.iter().cloned()).
In the first variant, b will be consumed and therefore no longer usable by the caller, but the strings from it will be reused in a. In the second variant, b will remain usable, but its strings will be copied for storage in a.
Note that in neither variant does it make sense to capture the return value of extend(), since it operates by side effect and doesn't return a useful value.
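As a minimal sketch of the two variants described above (the helper names extend_by_cloning and extend_by_moving are made up for illustration, they are not part of the standard library):

```rust
use std::collections::HashSet;

/// Variant 2: clone every string from `src` into `dst`; `src` stays usable.
fn extend_by_cloning(dst: &mut HashSet<String>, src: &HashSet<String>) {
    // `iter()` yields `&String`; `cloned()` turns each item into an owned `String`.
    dst.extend(src.iter().cloned());
}

/// Variant 1: move every string out of `src` into `dst`; `src` is consumed.
fn extend_by_moving(dst: &mut HashSet<String>, src: HashSet<String>) {
    // Passing the set by value yields owned `String`s, so nothing is cloned.
    dst.extend(src);
}
```

Either helper modifies dst in place; like extend() itself, neither returns anything worth capturing.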


About the change of the ownership of the array and vec

This code causes an error. It seems reasonable, because the ownership has moved:
fn main() {
    let mut arr = vec![1, 2];
    let mut arr2 = vec![2, 6];
    arr = arr2;
    arr2[1] = 2;
}
error[E0382]: borrow of moved value: `arr2`
--> src/main.rs:5:5
|
3 | let mut arr2 = vec![2, 6];
| -------- move occurs because `arr2` has type `Vec<i32>`, which does not implement the `Copy` trait
4 | arr = arr2;
| ---- value moved here
5 | arr2[1] = 2;
| ^^^^ value borrowed here after move
This code won't cause an error:
fn main() {
    let mut arr = [1, 2];
    let mut arr2 = [2, 4];
    arr = arr2;
    arr2[1] = 2;
}
This will cause an error:
fn main() {
    let mut arr = ["1".to_string(), "2".to_string()];
    let mut arr2 = ["2".to_string(), "4".to_string()];
    arr = arr2;
    arr2[1] = "2".to_string();
}
error[E0382]: use of moved value: `arr2`
--> src/main.rs:5:5
|
3 | let mut arr2 = ["2".to_string(), "4".to_string()];
| -------- move occurs because `arr2` has type `[String; 2]`, which does not implement the `Copy` trait
4 | arr = arr2;
| ---- value moved here
5 | arr2[1] = "2".to_string();
| ^^^^^^^ value used here after move
This works fine...
fn main() {
    let mut arr = ["1", "2"];
    let mut arr2 = ["2", "4"];
    arr = arr2;
    arr2[1] = "2";
}
I am completely confused.
In Rust all types fall into one of two categories:
copy-semantic (if it implements the Copy-trait)
move-semantic (if it does not implement the Copy-trait)
These semantics are implicitly employed by Rust whenever you e.g. assign a value to a variable. You can't choose it at the assignment. Instead, it depends only on the type in question. Therefore, whether a in the example below is moved or copied, depends entirely on what T is:
let a: T = /* some value */;
let b = a; // move or copy?
Also note that a generic type is really a family of similar types rather than a single concrete type. For instance, you can't tell whether [T; 2] is Copy without knowing T. Thus [&str; 2] and [String; 2] are two entirely different concrete types, and one can be Copy while the other is not.
To elaborate, a compound type can only be Copy if all of its constituent types are Copy. For instance, an array [T; 2] is Copy if T is, whereas Vec<T> is never Copy, regardless of T.
So in your examples that fail to compile because of move semantics, you have either a Vec, which is not Copy, or a String, which is not Copy either, and no combination involving them can be Copy. Only when you combine Copy types (arrays, integers, &str) do you get copy semantics, and those examples compile.
Also, this is not really an issue of ownership: with copy semantics you can generate new values (by copying them) wherever you need them, which gives you fresh ownership of those new values.
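To make the distinction concrete, here is a small sketch (the function names are hypothetical). It uses a trait bound as a compile-time check for Copy, and an explicit clone() for the [String; 2] case, which is one possible way to keep using the source array:

```rust
/// Compiles only when `T` is `Copy`; does nothing at runtime.
fn assert_copy<T: Copy>(_value: &T) {}

/// `[i32; 2]` is `Copy`, so assignment duplicates the array.
fn copy_then_modify() -> ([i32; 2], [i32; 2]) {
    let mut arr2 = [2, 4];
    let arr = arr2; // implicit copy: `arr2` remains valid
    arr2[1] = 2; // changes only `arr2`, not the copy in `arr`
    (arr, arr2)
}

/// `[String; 2]` is not `Copy`, so keeping both arrays needs an explicit clone.
fn clone_then_modify() -> ([String; 2], [String; 2]) {
    let mut arr2 = ["2".to_string(), "4".to_string()];
    let arr = arr2.clone(); // explicit deep copy; a plain `arr2` here would move it
    arr2[1] = "2".to_string();
    (arr, arr2)
}
```

Calling assert_copy(&[1, 2]) or assert_copy(&["1", "2"]) compiles, whereas assert_copy(&vec![1, 2]) is rejected because Vec<i32> does not implement Copy.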
A couple of things here. First of all, you are not changing ownership in the working examples; you are merely copying their values. In the failing examples you actually are transferring ownership.
As @eggyal correctly pointed out, String and Vec don't implement Copy. That matters because Copy is applied automatically when you assign one variable to another. In your working examples you have i32 (the default type for integer literals) and &str (also known as a string slice), which both implement Copy.
Every string literal is initially a &str, not a String. The .to_string() method in your example converts a &str into a String. If you want more in-depth information about the differences between &str and String, I suggest you either check this answer or read an article about it. What matters most in your case is that a &str is a shared reference: it points to data it cannot modify (for a literal, read-only memory in the binary), so it is safe to copy.
The compiler nowadays is so good that it tells you the entire story. I compiled your code and got this wonderful error message:
|
18 | let mut a = vec![2, 4];
| ----- move occurs because `a` has type `Vec<i32>`, which does not implement the `Copy` trait
19 | let b = a;
| - value moved here
20 | println!("{:?}", a);
| ^ value borrowed here after move

Why can't the compiler automatically infer the type returned by Iterator::collect? [duplicate]

I want to get the number of segments of a string which I've split:
fn fn1(my_string: String) -> bool {
    let mut segments = my_string.split(".");
    segments.collect().len() == 55
}
error[E0282]: type annotations needed
--> src/lib.rs:3:14
|
3 | segments.collect().len() == 55
| ^^^^^^^ cannot infer type for type parameter `B` declared on the associated function `collect`
|
= note: type must be known at this point
Previous compiler versions report the error:
error[E0619]: the type of this value must be known in this context
--> src/main.rs:3:5
|
3 | segments.collect().len() == 55
| ^^^^^^^^^^^^^^^^^^^^^^^^
How can I fix this error?
On an iterator, the collect method can produce many types of collections:
fn collect<B>(self) -> B
where
    B: FromIterator<Self::Item>,
Types that implement FromIterator include Vec, String and many more. Because there are so many possibilities, something needs to constrain the result type. You can specify the type with something like .collect::<Vec<_>>() or let something: Vec<_> = some_iter.collect().
Until the type is known, you cannot call the method len() because it's impossible to know if an unknown type has a specific method.
If you only want to find out how many items an iterator yields, use Iterator::count(); creating a vector just for that is rather inefficient.
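The options mentioned above could look roughly like this (a sketch; the function names are invented for the example, and the input is taken as &str rather than String for convenience):

```rust
/// Counts the segments without building any collection.
fn segment_count(s: &str) -> usize {
    s.split('.').count()
}

/// Names the target collection with the turbofish syntax.
fn segments_turbofish(s: &str) -> Vec<&str> {
    s.split('.').collect::<Vec<_>>()
}

/// Names the target collection with a type annotation on the binding.
fn segments_annotated(s: &str) -> usize {
    let segments: Vec<&str> = s.split('.').collect();
    // `len()` is callable now because the type is known to be `Vec<&str>`.
    segments.len()
}
```

With any of these, the original check becomes something like segment_count(&my_string) == 55.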

Does a '&&x' pattern match cause x to be copied?

In the documentation for std::iter::Iterator::filter() it explains that values are passed to the closure by reference, and since many iterators produce references, in that case the values passed are references to references. It offers some advice to improve ergonomics, by using a &x pattern to remove one level of indirection, or a &&x pattern to remove two levels of indirection.
However, I've found that this second pattern does not compile if the item being iterated does not implement Copy:
#[derive(PartialEq)]
struct Foo(i32);
fn main() {
    let a = [Foo(0), Foo(1), Foo(2)];
    // This works
    let _ = a.iter().filter(|&x| *x != Foo(1));
    // This also works
    let _ = a.iter().filter(|&x| x != &Foo(1));
    // This does not compile
    let _ = a.iter().filter(|&&x| x != Foo(1));
}
The error you get is:
error[E0507]: cannot move out of a shared reference
--> src/main.rs:14:30
|
14 | let _ = a.iter().filter(|&&x| x != Foo(1));
| ^^-
| | |
| | data moved here
| | move occurs because `x` has type `Foo`, which does not implement the `Copy` trait
| help: consider removing the `&`: `&x`
Does this mean that if I use the &&x destructuring pattern, and the value is Copy, Rust will silently copy every value I am iterating over? If so, why does that happen?
In Rust, function or closure arguments are irrefutable patterns.
In the Rust reference, it says:
By default, identifier patterns bind a variable to a copy of or move
from the matched value depending on whether the matched value
implements Copy.
So, in this case:
let _ = a.iter().filter(|&x| *x != Foo(1));
the closure is being passed a reference to a reference to the item being iterated; therefore x is bound to a copy of a reference to the item. You can always copy a reference (it is basically a no-op) so this always succeeds.
In this case:
let _ = a.iter().filter(|&&x| x != Foo(1));
x is being bound to a copy of the item itself - which fails if the item is not Copy.
The reference also says:
This can be changed to bind to a reference by using the ref keyword,
or to a mutable reference using ref mut.
That's not so useful in this case though: &&ref x results in x being a reference to the item, the same as if you had used &x.
Does this mean that if I use the &&x destructuring pattern, and the value is Copy, Rust will silently copy every value I am iterating over?
Yes, that's correct, although the compiler backend might optimize the copying away.
If so, why does that happen?
Writing & before a pattern dereferences it. This copies the value it refers to, because a reference only borrows its value, so it can't move the value.
Example:
#[derive(Copy, Clone)]
struct Foo(i32);
let foo: Foo = Foo(5); // value is owned by foo
let borrow: &Foo = &foo; // value is borrowed
let bar: Foo = *borrow; // value is copied, the copy is owned by bar
// the original value is still owned by foo
Your first example is a bit special:
let _ = a.iter().filter(|&x| *x != Foo(1));
This seems at first as if it should require the Copy trait because x is dereferenced twice, first in the pattern, and then in the comparison.
However, the comparison is done by the PartialEq trait, which takes its arguments by reference. So the above is desugared to:
let _ = a.iter().filter(|&x| PartialEq::eq(&*x, &Foo(1)));
Which works because &* cancel each other out. Hence, x is only dereferenced once (in the pattern).
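A short sketch contrasting the two patterns (the function names are made up; Foo here deliberately does not derive Copy):

```rust
#[derive(PartialEq)]
struct Foo(i32); // deliberately not `Copy`

/// `|&x|` strips one reference: `x` is a `&Foo`, so nothing is moved or copied
/// apart from the reference itself.
fn count_not_one(items: &[Foo]) -> usize {
    items.iter().filter(|&x| *x != Foo(1)).count()
}

/// `|&&x|` strips both references: `x` is an `i32` copied out of the slice,
/// which is only allowed because `i32` is `Copy`.
fn count_even(items: &[i32]) -> usize {
    items.iter().filter(|&&x| x % 2 == 0).count()
}
```

For small Copy types such as i32 the copy is about as cheap as copying the reference would be, so the &&x form is mostly a matter of convenience.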

Immutable access in rust

I am new to rust from python and have used the functional style in python extensively.
What I am trying to do is to take in a string (slice) (or any iterable) and iterate with a reference to the current index and the next index. Here is my attempt:
fn main() {
    // intentionally immutable, this should not change
    let x = "this is a
multiline string
with more
then 3 lines.";
    // initialize multiple (mutable) iterators over the slice
    let mut lineiter = x.chars();
    let mut afteriter = x.chars();
    // to have some reason to do this
    afteriter.skip(1);
    // zip them together, comparing the current line with the next line
    let mut zipped = lineiter.zip(afteriter);
    for (char1, char2) in zipped {
        println!("{:?} {:?}", char1, char2);
    }
}
I think it should be possible to get different iterators that have different positions in the slice but are referring to the same parts of memory without having to copy the string, but the error I get is as follows:
error[E0382]: use of moved value: `afteriter`
--> /home/alex/Documents/projects/simple-game-solver/src/src.rs:15:35
|
10 | let afteriter = x.chars();
| --------- move occurs because `afteriter` has type `std::str::Chars<'_>`, which does not implement the `Copy` trait
11 | // to have some reason to do this
12 | afteriter.skip(1);
| --------- value moved here
...
15 | let mut zipped = lineiter.zip(afteriter);
| ^^^^^^^^^ value used here after move
I also get a warning telling me that zipped does not need to be mutable.
Is it possible to instantiate multiple iterators over a single variable and if so how can it be done?
Is it possible to instantiate multiple iterators over a single variable and if so how can it be done?
If you check the signature and documentation for Iterator::skip:
fn skip(self, n: usize) -> Skip<Self>
Creates an iterator that skips the first n elements.
After they have been consumed, the rest of the elements are yielded. Rather than overriding this method directly, instead override the nth method.
You can see that it takes self by value (consumes the input iterator) and returns a new iterator. This is not a method which consumes the first n elements of the iterator in-place, it's one which converts the existing iterator into one which skips the first n elements.
So instead of:
let mut afteriter = x.chars();
afteriter.skip(1);
you just write:
let mut afteriter = x.chars().skip(1);
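Put together, pairing each character with its successor might be sketched like this (adjacent_pairs is a hypothetical helper name):

```rust
/// Pairs each character of `s` with the character that follows it.
/// Both iterators borrow the same string; the string itself is never copied.
fn adjacent_pairs(s: &str) -> Vec<(char, char)> {
    // `skip(1)` consumes the second `chars()` iterator and returns a new one
    // that starts at the second character; `zip` stops at the shorter side.
    s.chars().zip(s.chars().skip(1)).collect()
}
```

Collecting into a Vec is only for the sake of the example; the zipped iterator can be handed straight to a for loop instead.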
I also get a warning telling me that zipped does not need to be mutable.
That's because Rust's for loop uses the IntoIterator trait, which moves the iterable into the loop. It doesn't create a mutable reference; it just consumes whatever the right-hand side is.
Therefore it doesn't care about the mutability of the variable. You do need mut if you iterate explicitly, or if you call some other "terminal" method that takes &mut self (e.g. nth or try_fold or all), or if you want to iterate on the mutable reference (that's mostly useful for collections though), but not to hand off iterators to some other combinator method, or to a for loop.
A for loop takes self, if you will. Just as for_each does in fact.
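To illustrate when mut is and isn't needed, a sketch with two made-up helper functions:

```rust
/// No `mut` on `iter`: the `for` loop takes the iterator by value.
fn sum_with_for(values: &[i32]) -> i32 {
    let iter = values.iter();
    let mut total = 0;
    for v in iter {
        total += *v;
    }
    total
}

/// `mut` is required here: `next` takes `&mut self`.
fn first_two(values: &[i32]) -> (Option<i32>, Option<i32>) {
    let mut iter = values.iter().copied();
    let first = iter.next();
    let second = iter.next();
    (first, second)
}
```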
Thanks to @Stargateur for giving me the solution. The .skip(1) takes ownership of afteriter and returns a new iterator that skips the first element. What was happening before was that ownership was lost in the .skip call, so the variable could no longer be used (I am pretty sure).

How to append to string values in a hash table in Rust?

I have source files that contain text CSV lines for many products for a given day. I want to use Rust to collate these files so that I end up with many new destination CSV files, one per product, each containing portions of the lines only specific to that product.
My current solution is to loop over the lines of the source files and use a HashMap<String, String> to gather the lines for each product. I split each source line and use the element containing the product ID as a key, to obtain an Entry (occupied or vacant) in my HashMap. If it is vacant, I initialize the value with a new String that is allocated up-front with a given capacity, so that I can efficiently append to it thereafter.
// so far, so good (the first CSV item is the product ID)
let mystringval = productmap.entry(splitsource[0].to_owned()).or_insert(String::with_capacity(SOME_CAPACITY));
I then want to append formatted elements of the same source line to this Entry. There are many examples online, such as
https://doc.rust-lang.org/std/collections/hash_map/struct.HashMap.html#method.entry
of how to make this work if the HashMap value is an integer:
// this works if you obtain an Entry from a HashMap containing int vals
*myval += 1;
I haven't figured out how to append more text to the Entry I obtain from my HashMap<String, String> using this kind of syntax, and I've done my best to research examples online. There are surprisingly few examples anywhere of manipulating non-numeric entries in Rust data structures.
// using the Entry obtained from my first code snippet above
*mystringval.push_str(sourcePortion.as_str());
Attempting to compile this produces the following error:
error: type `()` cannot be dereferenced
--> coll.rs:102:17
|
102 | *mystringval.push_str(sourcePortion.as_str());
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
How can I append to a String inside the Entry value?
*mystringval.push_str(sourcePortion.as_str()); is parsed as *(mystringval.push_str(sourcePortion.as_str())); and since String::push_str returns (), you get the () cannot be dereferenced error.
Using parentheses around the dereference solves the precedence issue:
(*mystringval).push_str(sourcePortion.as_str());
The reason *myval += 1 works is because unary * has a higher precedence than +=, which means it's parsed as
(*myval) += 1
Since or_insert returns &mut V, you don't need to dereference it before calling its methods. The following also works:
mystringval.push_str(sourcePortion.as_str());
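For the collation use case from the question, a minimal sketch could look like the following. It assumes the product ID is the first comma-separated field, appends the remainder of each line plus a newline, and skips lines without a comma; the function name collate and those choices are assumptions, not something the question specifies:

```rust
use std::collections::HashMap;

/// Groups CSV lines by their first field (assumed to be the product ID),
/// appending the rest of each line to that product's string.
fn collate(lines: &[&str]) -> HashMap<String, String> {
    let mut by_product: HashMap<String, String> = HashMap::new();
    for line in lines {
        // `split_once` returns `None` for a line without a comma; skip those.
        if let Some((id, rest)) = line.split_once(',') {
            // `or_insert_with` returns `&mut String`, so `push_str` can be
            // called on it directly, with no explicit dereference.
            let entry = by_product.entry(id.to_owned()).or_insert_with(String::new);
            entry.push_str(rest);
            entry.push('\n');
        }
    }
    by_product
}
```

If the per-product strings are expected to grow large, the String::with_capacity call from the question can be used inside the closure passed to or_insert_with, so that the allocation only happens for vacant entries.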
If you inspect the type returned by or_insert:
fn update_count(map: &mut HashMap<&str, u32>) {
    let () = map.entry("hello").or_insert(0);
}
You will see it is a mutable reference:
error[E0308]: mismatched types
--> src/main.rs:4:9
|
4 | let () = map.entry("hello").or_insert(0);
| ^^ expected &mut u32, found ()
|
= note: expected type `&mut u32`
found type `()`
That means that you can call any method that needs a &mut self receiver with no extra syntax:
fn update_mapping(map: &mut HashMap<&str, String>) {
    map.entry("hello").or_insert_with(String::new).push_str("wow")
}
Turning back to the integer form, what happens if we don't put the dereference?
fn update_count(map: &mut HashMap<&str, i32>) {
    map.entry("hello").or_insert(0) += 1;
}
error[E0368]: binary assignment operation `+=` cannot be applied to type `&mut i32`
--> src/main.rs:4:5
|
4 | map.entry("hello").or_insert(0) += 1;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ cannot use `+=` on type `&mut i32`
error[E0067]: invalid left-hand side expression
--> src/main.rs:4:5
|
4 | map.entry("hello").or_insert(0) += 1;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ invalid expression for left-hand side
The difference is that the += operator automatically takes a mutable reference to the left-hand side of the expression. Expanded, it might look something like this:
use std::ops::AddAssign;
fn update_count(map: &mut HashMap<&str, i32>) {
    AddAssign::add_assign(&mut map.entry("hello").or_insert(0), 1);
}
Adding the explicit dereference brings the types back to one that has the trait implemented:
use std::ops::AddAssign;
fn update_count(map: &mut HashMap<&str, i32>) {
    AddAssign::add_assign(&mut (*map.entry("hello").or_insert(0)), 1);
}
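As a self-contained sketch of both spellings applied to a word count (the function names are invented for the example), the operator form needs the explicit dereference, while the trait call can take the &mut i32 from or_insert as it is:

```rust
use std::collections::HashMap;
use std::ops::AddAssign;

/// Counts words using the operator form with an explicit dereference.
fn count_words(text: &str) -> HashMap<&str, i32> {
    let mut counts: HashMap<&str, i32> = HashMap::new();
    for word in text.split_whitespace() {
        // `or_insert` yields `&mut i32`; `*` reaches the `i32` behind it.
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}

/// Same counting, spelled as a direct call to the trait method.
fn count_words_explicit(text: &str) -> HashMap<&str, i32> {
    let mut counts: HashMap<&str, i32> = HashMap::new();
    for word in text.split_whitespace() {
        // `add_assign` takes `&mut self`, which `or_insert` already provides.
        AddAssign::add_assign(counts.entry(word).or_insert(0), 1);
    }
    counts
}
```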
