I'm just getting started with Rust and have hit upon an issue where I'm getting a bit lost in types, pointers, borrowing and ownership.
I have a vector of Person structs, each of which has a name property. I'd like to collect all the names into another vector, and then check if that list contains a given string.
This is my code:
struct Person {
name: String,
}
fn main() {
let people = vec![Person {
name: String::from("jack"),
}];
let names: Vec<_> = people.iter().map(|p| p.name).collect();
let some_name = String::from("bob");
if names.iter().any(|n| n == some_name) {
}
}
This fails to compile:
error[E0277]: can't compare `&String` with `String`
--> src/main.rs:13:31
|
13 | if names.iter().any(|n| n == some_name) {
| ^^ no implementation for `&String == String`
|
= help: the trait `PartialEq<String>` is not implemented for `&String`
I think names should have the type Vec<String>, so I'm a bit lost as to where the &String comes from. Is this a case of me holding it wrong? I'm most familiar with JavaScript so I'm trying to apply those patterns, and may not be doing it in the most "rusty" way!
Several things going on there:
&String comes from the fact that iter() iterates over the collection without consuming it, and therefore can only give you references into items of the collection.
You can address the inability to compare &String to String by borrowing the latter, i.e. changing n == some_name to n == &some_name.
After you do that, the compiler will tell you that p.name attempts to move name out of the reference p. To get rid of it, you can change p.name to &p.name, thus creating a Vec<&String>.
...after which n will become &&String (!), so you'll need n == &&some_name. The example then compiles.
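The difference between borrowing and consuming iteration can be sketched as follows. This is only an illustration under assumptions: the helper names borrowed_names and owned_names are made up for the example and do not appear in the question.

```rust
struct Person {
    name: String,
}

// Hypothetical helper: `iter()` yields `&Person`, so only references
// to the names can be collected; `people` stays usable afterwards.
fn borrowed_names(people: &[Person]) -> Vec<&String> {
    people.iter().map(|p| &p.name).collect()
}

// Hypothetical helper: `into_iter()` on a `Vec<Person>` yields `Person`
// by value, so each name can be moved out. The vector is consumed.
fn owned_names(people: Vec<Person>) -> Vec<String> {
    people.into_iter().map(|p| p.name).collect()
}

fn main() {
    let people = vec![Person {
        name: String::from("jack"),
    }];
    let some_name = String::from("jack");

    // Items are `&&String` here, so dereference once before comparing.
    assert!(borrowed_names(&people).iter().any(|n| *n == &some_name));

    // Items are `&String` here.
    assert!(owned_names(people).iter().any(|n| n == &some_name));
}
```

Which variant fits depends on whether people is still needed after the names have been extracted.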
The type of p is not Person but &Person, i.e. a reference to a value. In JavaScript there are no structs. The closest alternative is an object and JS objects are always stored in the heap and are accessed via references (although that's less explicit). In Rust you really need to think more about the memory layout of your data. That can complicate parts of your code, but it also gives you power and allows for better performance.
Here is a correct version of your code:
struct Person {
name: String,
}
fn main() {
let people = vec![Person {
name: String::from("jack"),
}];
let names: Vec<_> = people.iter().map(|p| &p.name).cloned().collect();
let some_name = String::from("bob");
if names.iter().any(|n| n == &some_name) {
}
}
Notice that in n == &some_name we are now comparing a &String with a &String.
Also note I added .cloned(), so that you avoid another error, related to moving the name strings.
A slightly more optimized version would be to only collect the references to the names, like this:
let names: Vec<_> = people.iter().map(|p| &p.name).collect();
let some_name = String::from("bob");
if names.iter().any(|n| *n == &some_name) {
}
Note that in this version n is of type &&String, so we dereference it once via *n. That way we are again comparing a &String with a &String. Rust also allows you to write something like this: |&n| n == &some_name, which is practically the same.
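If the names vector is not needed for anything else, the intermediate collection can be skipped entirely. The following is a minimal sketch; has_name is a hypothetical helper introduced for illustration, not something from the question.

```rust
struct Person {
    name: String,
}

// Hypothetical helper: checks the names in place, without building a Vec first.
fn has_name(people: &[Person], wanted: &str) -> bool {
    // `p` is `&Person`. `String == &str` is implemented in std,
    // and comparing only borrows `p.name`, so nothing is moved.
    people.iter().any(|p| p.name == wanted)
}

fn main() {
    let people = vec![Person {
        name: String::from("jack"),
    }];
    assert!(has_name(&people, "jack"));
    assert!(!has_name(&people, "bob"));
}
```

Taking wanted as &str also lets callers pass string literals without allocating a String first.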
The comparison fails because Rust does not impl PartialEq<String> for &String; the implementations that do exist are PartialEq<&String> for &String and PartialEq<String> for String.
Here the type of n is &String, the type of some_name is String, so you need to balance the references on the left and right side of ==.
Both of the following fixes work. I prefer the first one, because the second one contains an explicit dereference, which to me suggests extra overhead. I'm not sure whether Rust guarantees there is no overhead here, but even if there were, LLVM can usually optimize it away.
if names.iter().any(|n| n == &some_name) // &String == &String
// or
if names.iter().any(|n| *n == some_name) // String == String
Then you will encounter another error:
error[E0507]: cannot move out of `p.name` which is behind a shared reference
--> src/main.rs:10:47
|
10 | let names: Vec<_> = people.iter().map(|p| p.name).collect();
| ^^^^^^ move occurs because `p.name` has type `String`, which does not implement the `Copy` trait
For more information about this error, try `rustc --explain E0507`.
|p| p.name tries to move the name out of p, which is not necessary for the later checks; a &str is enough here. You can get the &str of a String via String::as_str: |p| p.name.as_str()
Because of this fix, names becomes a Vec<&str>, so iterating with names.iter() would make n a &&str. You can use names.into_iter() instead, which takes ownership of names and yields each &str directly.
Fixed code:
struct Person {
name: String,
}
fn main() {
let people = vec![Person {
name: String::from("jack"),
}];
let names: Vec<_> = people.iter().map(|p| p.name.as_str()).collect();
let some_name = String::from("bob");
if names.into_iter().any(|n| n == some_name) {
}
}
Related
I have some code like this
struct MyStruct {
id: u32
}
fn main() {
let vec: Vec<MyStruct> = vec![MyStruct {
id: 1
}];
let my_struct = vec[0];
}
I thought my_struct must be of reference type, but the compiler proved I was wrong:
error[E0507]: cannot move out of index of `Vec<MyStruct>`
--> src/main.rs:10:21
|
10 | let my_struct = vec[0];
| ^^^^^^ move occurs because value has type `MyStruct`, which does not implement the `Copy` trait
|
help: consider borrowing here
|
10 | let my_struct = &vec[0];
So I read the doc of trait Index and noticed the sentence below:
container[index] is actually syntactic sugar for *container.index(index)
It's counterintuitive. Why does Rust dereference it? Does it mean that it's friendlier to types that impl Copy?
There are plenty of other questions asking about this (1 2 3) but none really address the why part.
Rust has made many design decisions that work to make ownership clear. For example, you cannot pass a value, v, directly to a function f that expects a reference. It must be f(&v). There are places where this isn't always the case (macros like println! and auto-ref for method calls being prime exceptions), but many parts of the language follow this principle.
The behavior of the index operator is pretty much the same as that of the normal field access operator (.). If you have my_struct: &MyStruct then my_struct.id will yield a u32 and not a &u32. The default behavior is to move fields. You have to introduce a & (i.e. &my_struct.id) to get a reference to the field. The same goes for the index operator. If you want to make it clear that you want only a reference to the element, then you need to introduce a &.
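The desugaring quoted from the Index docs can be written out by hand. This is a sketch only: first_ref and first_id are hypothetical helpers, and the derives on MyStruct were added just so the two references can be compared.

```rust
use std::ops::Index;

#[derive(Debug, PartialEq)]
struct MyStruct {
    id: u32,
}

// Hypothetical helper: `&items[0]` spelled out as `&*items.index(0)`.
fn first_ref(items: &[MyStruct]) -> &MyStruct {
    &*items.index(0)
}

// Hypothetical helper: `id` is a `u32`, which is `Copy`, so reading the
// field through the indexed place copies it instead of moving it.
fn first_id(items: &[MyStruct]) -> u32 {
    items[0].id
}

fn main() {
    let v = vec![MyStruct { id: 1 }];

    // Both expressions borrow the same element.
    let by_sugar: &MyStruct = &v[0];
    let by_hand: &MyStruct = first_ref(&v);
    assert_eq!(by_sugar, by_hand);

    assert_eq!(first_id(&v), 1);

    // let moved = v[0]; // would not compile: MyStruct is not Copy
}
```

Like v[0], first_ref panics on an empty slice; get(0) is the non-panicking alternative.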
I'm trying to manipulate a string derived from a function parameter and then return the result of that manipulation:
fn main() {
let a: [u8; 3] = [0, 1, 2];
for i in a.iter() {
println!("{}", choose("abc", *i));
}
}
fn choose(s: &str, pad: u8) -> String {
let c = match pad {
0 => ["000000000000000", s].join("")[s.len()..],
1 => [s, "000000000000000"].join("")[..16],
_ => ["00", s, "0000000000000"].join("")[..16],
};
c.to_string()
}
On building, I get this error:
error[E0277]: the trait bound `str: std::marker::Sized` is not satisfied
--> src\main.rs:9:9
|
9 | let c = match pad {
| ^ `str` does not have a constant size known at compile-time
|
= help: the trait `std::marker::Sized` is not implemented for `str`
= note: all local variables must have a statically known size
What's wrong here, and what's the simplest way to fix it?
TL;DR Don't use str, use &str. The reference is important.
The issue can be simplified to this:
fn main() {
let demo = "demo"[..];
}
You are attempting to slice a &str (but the same would happen for a String, &[T], Vec<T>, etc.), but have not taken a reference to the result. This means that the type of demo would be str. To fix it, add an &:
let demo = &"demo"[..];
In your broader example, you are also running into the fact that you are creating an allocated String inside of the match statement (via join) and then attempting to return a reference to it. This is disallowed because the String will be dropped at the end of the match, invalidating any references. In another language, this could lead to memory unsafety.
One potential fix is to store the created String for the duration of the function, preventing its deallocation until after the new string is created:
fn choose(s: &str, pad: u8) -> String {
let tmp;
match pad {
0 => {
tmp = ["000000000000000", s].join("");
&tmp[s.len()..]
}
1 => {
tmp = [s, "000000000000000"].join("");
&tmp[..16]
}
_ => {
tmp = ["00", s, "0000000000000"].join("");
&tmp[..16]
}
}.to_string()
}
Editorially, there are probably more efficient ways of writing this function. The formatting machinery has options for padding strings. You might even be able to just truncate the string returned from join without creating a new one.
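As a rough sketch of the padding idea, the fill and alignment options of format! can produce the same strings for short inputs. This assumes s is ASCII and no longer than the target width; for longer inputs the first arm behaves differently from the slicing version, so treat it as an illustration rather than a drop-in replacement.

```rust
// Sketch only: assumes `s` is ASCII and at most 15 bytes long.
fn choose(s: &str, pad: u8) -> String {
    let mut out = match pad {
        // Right-align `s` in a field of width 15, filling with '0'.
        0 => format!("{:0>15}", s),
        // Left-align `s` in a field of width 16, filling with '0'.
        1 => format!("{:0<16}", s),
        // Two leading zeros, then `s` left-aligned in the remaining 14.
        _ => format!("00{:0<14}", s),
    };
    // Truncate in place instead of slicing and copying into a new String.
    // This is a no-op when `out` is already 16 bytes or shorter.
    out.truncate(16);
    out
}

fn main() {
    let a: [u8; 3] = [0, 1, 2];
    for i in a.iter() {
        println!("{}", choose("abc", *i));
    }
}
```

Note that String::truncate panics if the cut does not fall on a character boundary, which is why the ASCII assumption matters.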
What it means is harder to explain succinctly. Rust has a number of types that are unsized. The most prevalent ones are str and [T]. Contrast these types to how you normally see them used: &str or &[T]. You might even see them as Box<str> or Arc<[T]>. The commonality is that they are always used behind a reference of some kind.
Because these types don't have a size, they cannot be stored in a variable on the stack — the compiler wouldn't know how much stack space to reserve for them! That's the essence of the error message.
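A small sketch can make the sized/unsized distinction more concrete. The pointer sizes asserted here reflect how current compilers lay out references to str and slices (address plus length); byte_len is a hypothetical helper added for the example.

```rust
use std::mem::{size_of, size_of_val};

// Hypothetical helper: the length travels with the reference,
// so the size of the `str` behind it is only known at run time.
fn byte_len(s: &str) -> usize {
    size_of_val(s)
}

fn main() {
    // References and boxes to unsized types are "fat": address plus length.
    assert_eq!(size_of::<&str>(), 2 * size_of::<usize>());
    assert_eq!(size_of::<&[u8]>(), 2 * size_of::<usize>());
    assert_eq!(size_of::<Box<str>>(), 2 * size_of::<usize>());

    // A reference to a sized type is a single pointer.
    assert_eq!(size_of::<&u8>(), size_of::<usize>());

    assert_eq!(byte_len("demo"), 4);

    // let bare: str = *"demo"; // would not compile: `str` is not Sized
}
```

The pointer itself always has a fixed size, which is why &str, Box<str> and Arc<[T]> can live in local variables while a bare str cannot.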
See also:
What is the return type of the indexing operation?
Return local String as a slice (&str)
Why your first FizzBuzz implementation may not work
Here is the exercise on Exercism.
I just wanted to learn the functional way.
use std::collections::HashMap;
pub fn can_construct_note(magazine: &[&str], note: &[&str]) -> bool {
let mut words: HashMap<&str, i32> = HashMap::new();
magazine.iter().map(|&w|
words.entry(w)
.and_modify(|e| *e += 1)
.or_insert(1)
);
println!("{:?}", words);
false
}
But I got this strange error, and searching for it didn't help me solve it.
I understand that it can't be done this way.
I'd like to know the correct way to do this.
Thanks.
error: captured variable cannot escape `FnMut` closure body
--> src/lib.rs:11:9
|
8 | let mut words: HashMap<&str, i32> = HashMap::new();
| --------- variable defined here
9 |
10 | let mut t = magazine.iter().map(|&w|
| - inferred to be a `FnMut` closure
11 | words.entry(w)
| ^----
| |
| _________variable captured here
| |
12 | | .and_modify(|e| *e += 1)
13 | | .or_insert(1)
| |_________________________^ returns a reference to a captured variable which escapes the closure body
|
= note: `FnMut` closures only have access to their captured variables while they are executing...
= note: ...therefore, they cannot allow references to captured variables to escape
error: aborting due to previous error
error: could not compile `magazine_cutout`
To learn more, run the command again with --verbose.
One problem with your code is that Iterator::map() is lazy: it converts the iterator to another iterator without actually iterating over either. Because of that, ending the expression with map() is not very useful; you need to do something that will exhaust the iterator. If you want to do it the functional way, you probably want for_each().
The other problem is that Entry::or_insert() returns a mutable reference to the inserted/retrieved value. This is normally used to chain operations that modify the value, such as or_insert(vec![]).push(item). Your closure doesn't end with ;, so its return value is the reference returned by or_insert(). Rust interprets such map() invocation as intending to transform the iterator over words to an iterator over mutable references to their counts. You would then be free to do whatever you want with those references, perhaps collect them in a vector. This is of course a big problem, as you're not allowed to have more than one mutable reference to anything inside the hashmap at once. This is why the borrow checker complains of the reference leaking out of the closure. To fix this, just add the braces and use a ;, so the closure returns () (which is incidentally the only valid return type of for_each()).
This compiles:
use std::collections::HashMap;
pub fn can_construct_note(magazine: &[&str], note: &[&str]) -> bool {
let mut words: HashMap<&str, i32> = HashMap::new();
magazine.iter().for_each(|&w| {
words.entry(w).and_modify(|e| *e += 1).or_insert(1);
});
println!("{:?}", words);
false
}
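The chaining that the reference returned by or_insert() enables can be sketched like this. Both functions are hypothetical examples written for illustration; they use a plain for loop so that the returned reference is used and dropped within each iteration.

```rust
use std::collections::HashMap;

// Hypothetical helper: `or_insert(0)` returns `&mut i32`,
// which is dereferenced and incremented in place.
fn count_words<'a>(words: &[&'a str]) -> HashMap<&'a str, i32> {
    let mut counts: HashMap<&str, i32> = HashMap::new();
    for &w in words {
        *counts.entry(w).or_insert(0) += 1;
    }
    counts
}

// Hypothetical helper: the same pattern with a Vec as the value,
// chaining `push` onto the returned `&mut Vec<_>`.
fn group_by_len<'a>(words: &[&'a str]) -> HashMap<usize, Vec<&'a str>> {
    let mut groups: HashMap<usize, Vec<&str>> = HashMap::new();
    for &w in words {
        groups.entry(w.len()).or_insert(vec![]).push(w);
    }
    groups
}

fn main() {
    let words = ["a", "bb", "a", "c"];
    println!("{:?}", count_words(&words));
    println!("{:?}", group_by_len(&words));
}
```

Because each statement ends with a semicolon, the mutable borrow of the map ends before the next iteration starts, which is exactly what the original map() closure failed to guarantee.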
As others pointed out in the comments, an even more functional approach would be to use Iterator::fold(), which wouldn't require a mutating capture of the hashmap.
Functional programming is a way of approaching problems. It's not about syntax. Using map to modify an external mutable HashMap isn't functional programming. It's just an abuse of map (mapping should have no side-effects). There is nothing functional at all about for_each. It's just another syntax for for...in (and IMO an inferior syntax in most cases).
Broadly, functional programming avoids mutation and approaches problems recursively rather than by looping. Ömer Erden's solution is good in that it encapsulates the mutation inside the fold, but it's still basically a fancy loop. There's not a lot "functional" about that.
A functional way to think about this problem is recursively. Sort the words in both lists. Then the core mechanic is: Look at the first word in each list. If they match, recurse on the next word in each list. If they don't, recurse on the same goal list, and the next word on the source list. If the goal list is empty: success. If the source list is empty: fail. Notice that there was never a "count" step in there, and there's no HashMap. So I'm skipping your direct question and focusing on solving the full problem (since you said you wanted to explore functional approaches).
The first step towards that is to sort the words. There's no non-mutating sorted method in std, but there is one in the itertools crate. Still, I can make a simple (and extremely sloppy and special-case) one.
fn sorted<'a>(items: &[&'a str]) -> Vec<&'a str> {
let mut v = Vec::from(items);
v.sort();
v
}
Note that this function is not internally "functional." But it provides a functional interface. This is very common in FP. Functional approaches are sometimes slower than imperative/mutating approaches. We can keep the value of FP while improving performance by encapsulating mutation.
But with that in place, I can build a fully functional solution. Notice the lack of any mut.
use std::cmp::Ordering;
pub fn can_construct_note(magazine: &[&str], note: &[&str]) -> bool {
// source and goal are sorted
fn f(source: &[&str], goal: &[&str]) -> bool {
// Split the source and goal into their heads and tails
// Consider just the first elements.
match (source.split_first(), goal.split_first()) {
(_, None) => true, // Can make nothing from anything
(None, Some(_)) => false, // Can't make something from nothing
(Some((s, s_rest)), Some((g, g_rest))) => {
match s.cmp(g) {
// If they match, then recurse on the tails
Ordering::Equal => f(s_rest, g_rest),
// If source < goal, they may match eventually, recurse with the next source element
Ordering::Less => f(s_rest, goal),
// If source > goal, it'll never work
Ordering::Greater => false,
}
}
}
}
// Sort the initial lists
let source = sorted(magazine);
let goal = sorted(note);
// And kick it off and return its result
f(&source[..], &goal[..])
}
This is a very functional way to solve the problem, to the point of being a text-book example. But notice there's not a single map, reduce, fold, or filter anywhere. Those are really important tools in functional programming, but they're not what it means to be functional.
It's not really great Rust. If these lists are very long, then this will likely overflow the stack, because Rust does not guarantee tail-call optimization (which is a critical optimization for recursion to be really workable).
Recursion can always be turned into a loop, however. So at the cost of a small amount of visible mutation, this can be rewritten. Rather than recursively calling f(...), this changes source and goal and loops.
pub fn can_construct_note(magazine: &[&str], note: &[&str]) -> bool {
let mut source = &sorted(magazine)[..];
let mut goal = &sorted(note)[..];
// source and goal are sorted
loop {
// Split the source and goal into their heads and tails
match (source.split_first(), goal.split_first()) {
(_, None) => return true, // Can make nothing from anything
(None, Some(_)) => return false, // Can't make something from nothing
(Some((s, s_rest)), Some((g, g_rest))) => {
match s.cmp(g) {
// If they match, then advance both lists to their tails
Ordering::Equal => {
source = s_rest;
goal = g_rest;
continue;
}
// If source < goal, they may match eventually, so advance to the next source element
Ordering::Less => {
source = s_rest;
continue;
}
// If source > goal, it'll never work
Ordering::Greater => return false,
}
}
}
}
}
To Ömer's comments below, this is how you would create the HashMap itself in a functional way. On Rust 1.77 and later this works on stable through slice::chunk_by; earlier it required +nightly, where the method was named group_by.
use std::collections::HashMap;
use std::iter::FromIterator;
fn word_count<'a>(strings: &[&'a str]) -> HashMap<&'a str, usize> {
let sorted_strings = sorted(strings);
let groups = sorted_strings
.chunk_by(|a, b| a == b)
.map(|g| (g[0], g.len()));
HashMap::from_iter(groups)
}
I'm not worried about careful lifetime management here. I'm just focused on how to think in FP. This approach sorts the strings, then groups the strings by equality, and then maps those groups into a tuple of "the string" and "the number of copies." That list of tuples is then turned into a HashMap. There's no need for any mutable variables.
If you want a really functional way, you should do this:
use std::collections::HashMap;
fn main() {
let s = "aasasdasdasdasdasdasdasdfesrewr";
let map = s.chars().fold(HashMap::new(), |mut acc, c| {
acc.entry(c).and_modify(|x| *x += 1).or_insert(1i32);
acc
});
println!("{:?}", map);
}
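Applied to the original exercise, the same fold can be used to finish can_construct_note. This is a sketch under the assumption that the note is constructible when every word it needs appears at least as often in the magazine; word_count here is a hypothetical helper built on fold.

```rust
use std::collections::HashMap;

// Hypothetical helper: counts occurrences with fold, so the only
// mutation happens on the accumulator owned by the closure.
fn word_count<'a>(words: &[&'a str]) -> HashMap<&'a str, usize> {
    words.iter().fold(HashMap::new(), |mut acc, &w| {
        *acc.entry(w).or_insert(0) += 1;
        acc
    })
}

pub fn can_construct_note(magazine: &[&str], note: &[&str]) -> bool {
    let available = word_count(magazine);
    let required = word_count(note);
    // Every word the note needs must be available at least that many times.
    required
        .iter()
        .all(|(word, needed)| available.get(word).map_or(false, |have| have >= needed))
}

fn main() {
    let magazine = ["two", "times", "three", "is", "not", "four"];
    assert!(can_construct_note(&magazine, &["two", "times", "four"]));
    assert!(!can_construct_note(&magazine, &["two", "times", "two"]));
}
```

An empty note is trivially constructible, since all() over an empty iterator returns true.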
I tried to use a String vector inside another vector:
let example: Vec<Vec<String>> = Vec::new();
for _number in 1..10 {
let mut temp: Vec<String> = Vec::new();
example.push(temp);
}
I should have 10 empty String vectors inside my vector, but:
example.get(0).push(String::from("test"));
fails with
error[E0599]: no method named `push` found for type `std::option::Option<&std::vec::Vec<std::string::String>>` in the current scope
--> src/main.rs:9:20
|
9 | example.get(0).push(String::from("test"));
| ^^^^
Why does it fail? Is it even possible to have a vector "inception"?
I highly recommend reading the documentation of types and methods before you use them. At the very least, look at the function's signature. For slice::get:
pub fn get<I>(&self, index: I) -> Option<&<I as SliceIndex<[T]>>::Output>
where
I: SliceIndex<[T]>,
While there are some generics happening here, the important part is that the return type is an Option. An Option<&Vec<String>> is not a Vec<String>, so it has no push method.
Refer back to The Rust Programming Language's chapter on enums for more information about enums, including Option and Result. If you wish to continue using the semantics of get, you will need to:
Switch to get_mut as you want to mutate the inner vector.
Make example mutable.
Handle the case where the indexed value is missing. Here I use an if let.
let mut example: Vec<_> = std::iter::repeat_with(Vec::new).take(10).collect();
if let Some(v) = example.get_mut(0) {
v.push(String::from("test"));
}
If you want to kill the program if the value is not present at the index, the shortest thing is to use the index syntax []:
example[0].push(String::from("test"));
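To make the difference between the two approaches concrete, here is a small sketch that reports a missing index instead of panicking. push_at is a hypothetical helper written for this example.

```rust
// Hypothetical helper: pushes `value` into the inner vector at `index`.
// Returns false instead of panicking when the index is out of bounds.
fn push_at(grid: &mut Vec<Vec<String>>, index: usize, value: &str) -> bool {
    match grid.get_mut(index) {
        Some(row) => {
            row.push(value.to_string());
            true
        }
        None => false,
    }
}

fn main() {
    // Ten empty inner vectors.
    let mut example: Vec<Vec<String>> = vec![Vec::new(); 10];

    assert!(push_at(&mut example, 0, "test"));
    // Index 10 is one past the end, so nothing is pushed.
    assert!(!push_at(&mut example, 10, "test"));

    assert_eq!(example[0], vec![String::from("test")]);
    assert_eq!(example.len(), 10);
}
```

With example[10].push(...) the second call would panic instead of returning false.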