Iterate over Vec<"CustomStruct"> => unsatisfied trait bounds - rust

I'm participating in this year's Advent of Code and wanted to take the opportunity to learn Rust. (So, if you're also participating, the following section might spoil something).
I want to iterate over the Vec<Lanternfish> and decrement the internal_counter value of each item in this vector. I tried the following:
let test: Vec<Lanternfish> = fish_list.map(|fish| fish.decrement_couner()).collect();
The compiler gives me the following error: method cannot be called on Vec<Lanternfish> due to unsatisfied trait bounds
I understand that the iterator function is not available for this, however I don't understand exactly how to fix the problem.
#[derive(Debug)]
struct Lanternfish {
internal_counter: u8,
}
impl Lanternfish {
fn new() -> Self {
Lanternfish {
internal_counter: 8,
}
}
fn decrement_counter(&mut self) {
self.internal_counter -= 1
}
}
fn part_one(content: &str) {
let content: Vec<char> = content.chars().filter(|char| char.is_digit(10)).collect();
let mut fish_list: Vec<Lanternfish> = init_list(content);
let test: Vec<Lanternfish> = fish_list.map(|fish| fish.decrement_counter()).collect();
}
fn init_list(initial_values: Vec<char>) -> Vec<Lanternfish> {
let mut all_lanternfish: Vec<_> = Vec::new();
for value in initial_values {
all_lanternfish.push(Lanternfish{internal_counter: value as u8});
}
all_lanternfish
}

The way to iterate over a Vec and call a mutating function on each element is:
for fish in &mut fish_list {
fish.decrement_counter();
}
Here is what this line does, step by step:
fish_list.map(|fish| fish.decrement_couner).collect();
It tries to call map on the Vec. Vec doesn't have that method; Iterator has it, but you'd need to call iter(), iter_mut() or into_iter() on the Vec first to get an iterator.
Assuming you get the right map, it then calls the lambda |fish| fish.decrement_couner on each element; typo aside, this is not a function call, but a field access, and Lanternfish doesn't have a field called decrement_couner. A call would need parentheses.
Assuming you fix the function call, you then collect all the results of the calls (a bunch of () "unit" values, since decrement_counter doesn't return anything) into a new Vec, which is of type Vec<()>.
And finally, you try to bind that to a variable of Vec<Lanternfish>, which will fail.
Meanwhile, the function calls will have modified the original Vec, if you used iter_mut(). Otherwise, the function calls will not compile.
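As a minimal sketch of the two working variants described above: mutate in place with iter_mut(), or build a new Vec<Lanternfish> with map and collect. The helper names decrement_all and decremented_copy are illustrative and not part of the original code, and the example assumes every counter is greater than zero (subtracting 1 from a u8 of 0 panics in debug builds).

```rust
#[derive(Debug, PartialEq)]
struct Lanternfish {
    internal_counter: u8,
}

impl Lanternfish {
    fn decrement_counter(&mut self) {
        self.internal_counter -= 1;
    }
}

// Hypothetical helper: mutates every fish in place.
fn decrement_all(fish_list: &mut [Lanternfish]) {
    for fish in fish_list.iter_mut() {
        fish.decrement_counter();
    }
}

// Hypothetical helper: leaves the input untouched and collects new values.
// The closure returns a Lanternfish, so collect() yields Vec<Lanternfish>
// rather than Vec<()>.
fn decremented_copy(fish_list: &[Lanternfish]) -> Vec<Lanternfish> {
    fish_list
        .iter()
        .map(|fish| Lanternfish {
            internal_counter: fish.internal_counter - 1,
        })
        .collect()
}

fn main() {
    let mut fish_list = vec![
        Lanternfish { internal_counter: 3 },
        Lanternfish { internal_counter: 4 },
    ];
    decrement_all(&mut fish_list);
    let copy = decremented_copy(&fish_list);
    println!("{:?} {:?}", fish_list, copy);
}
```

The in-place version is usually what you want here, since no new vector is allocated.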

Related

Why does std::iter::Peekable::peek mutably borrow the self argument?

I am struggling to understand why the peek() method is borrowing the self argument mutably.
The documentation says:
"Returns a reference to the next() value without advancing the iterator."
Since it is not advancing the iterator, what is the point behind borrowing the argument as mutable?
I looked at the implementation of peek() and noticed it is calling a next() method.
#[inline]
#[stable(feature = "rust1", since = "1.0.0")]
pub fn peek(&mut self) -> Option<&I::Item> {
let iter = &mut self.iter;
self.peeked.get_or_insert_with(|| iter.next()).as_ref()
}
Is peek() designed to borrow mutably only because it calls next(), or is there another semantic reason behind peek() that really requires the mutable borrow?
In other words, what is it that gets mutated when the peek() method is called?
As you have already done, let's look at its source, which reveals a little about how it works internally:
pub struct Peekable<I: Iterator> {
iter: I,
/// Remember a peeked value, even if it was None.
peeked: Option<Option<I::Item>>,
}
Together with its implementation for next():
impl<I: Iterator> Iterator for Peekable<I> {
// ...
fn next(&mut self) -> Option<I::Item> {
match self.peeked.take() {
Some(v) => v,
None => self.iter.next(),
}
}
// ...
}
and its implementation for peek():
impl<I: Iterator> Peekable<I> {
// ...
pub fn peek(&mut self) -> Option<&I::Item> {
let iter = &mut self.iter;
self.peeked.get_or_insert_with(|| iter.next()).as_ref()
}
// ...
}
Peekable wraps an existing iterator, and existing iterators are not peekable on their own.
So what Peekable does is:
on peek():
take the next() item from the wrapped iterator and store it in self.peeked (if self.peeked does not yet contain the next item already)
return a reference to the peeked item
on next():
see if we currently have a self.peeked item
if yes, return that one
if no, take the next() item from the underlying iterator.
So as you already realized, the peek() action needs &mut self because it might have to generate the next peeked item by calling next() on the underlying iterator.
So here is the reason, if you look at it from a more abstract point of view: The next item might not even exist yet. So peeking might involve actually generating that next item, which is definitely a mutating action on the underlying iterator.
Not all iterators are over arrays/slices where the items already exist; an iterator might be anything that generates a number of items, including lazy generators that only create said items as they are asked for.
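To make the mutation visible, here is a small sketch with a lazy generator that counts how often it is asked for an item. The function name count_pulls and the Cell-based counter are assumptions made for illustration; only std::iter::from_fn and Iterator::peekable come from the standard library.

```rust
use std::cell::Cell;

// Returns how many items the underlying generator had produced
// (before any peek, after the first peek, after the second peek).
fn count_pulls() -> (u32, u32, u32) {
    let pulled = Cell::new(0u32);
    // A lazy generator: each call creates the next number on demand.
    let source = std::iter::from_fn(|| {
        pulled.set(pulled.get() + 1);
        Some(pulled.get())
    });
    let mut it = source.peekable();

    let before = pulled.get();
    let _ = it.peek(); // has to generate the item and store it
    let after_first = pulled.get();
    let _ = it.peek(); // served from the stored item, nothing is generated
    let after_second = pulled.get();

    (before, after_first, after_second)
}

fn main() {
    println!("{:?}", count_pulls());
}
```

The first peek() advances the wrapped generator and fills the internal slot, which is exactly the state change that requires &mut self; the second peek() only reads that slot.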
Could they have implemented it differently?
Yes, there absolutely is the possibility to do it differently. They could have next()ed the underlying iterator during new(). Then, when someone calls next() on the Peekable, it could return the currently peeked value and query the next one right away. Then, peeking would have been a &self method.
Why they went that way is not documented, but most likely it was to keep the iterator as lazy as possible. Lazy iterators are a good thing in most cases.
That said, here is a proof of concept of how a prefetching peekable iterator could be implemented that doesn't require &mut for peek():
pub struct PrefetchingPeekingIterator<I: Iterator> {
iter: I,
next_item: Option<I::Item>,
}
impl<I: Iterator> PrefetchingPeekingIterator<I> {
fn new(mut iter: I) -> Self {
let next_item = iter.next();
Self { iter, next_item }
}
fn peek(&self) -> Option<&I::Item> {
self.next_item.as_ref()
}
}
impl<I: Iterator> Iterator for PrefetchingPeekingIterator<I> {
type Item = I::Item;
fn next(&mut self) -> Option<Self::Item> {
std::mem::replace(&mut self.next_item, self.iter.next())
}
}
fn main() {
let mut range = PrefetchingPeekingIterator::new(1..10);
dbg!(range.next().unwrap());
dbg!(range.peek().unwrap());
dbg!(range.next().unwrap());
dbg!(range.peek().unwrap());
dbg!(range.next().unwrap());
dbg!(range.peek().unwrap());
}
[src/main.rs:27] range.next().unwrap() = 1
[src/main.rs:28] range.peek().unwrap() = 2
[src/main.rs:29] range.next().unwrap() = 2
[src/main.rs:30] range.peek().unwrap() = 3
[src/main.rs:31] range.next().unwrap() = 3
[src/main.rs:32] range.peek().unwrap() = 4
Yes, as you noticed, Peekable::peek might have to call self.iter.next() to get an element if its self.peeked doesn't already have something stored.
It then also has to store that value somewhere to not advance the Peekable iterator.
The underlying Iterator may very well get advanced by it.
Both advancing the underlying Iterator as well as storing the value in self.peeked require mutable access to self.

Rust: how to assign `iter().map()` or `iter().enumerate()` to same variable

struct A {...whatever...};
const MY_CONST_USIZE:usize = -127;
// somewhere in function
// vec1_of_A:Vec<A> vec2_of_A_refs:Vec<&A> have values from different data sources and have different inside_item types
let my_iterator;
if my_rand_condition() { // my_rand_condition is random and compiles for sake of simplicity
my_iterator = vec1_of_A.iter().map(|x| (MY_CONST_USIZE, &x)); // Map<Iter<Vec<A>>>
} else {
my_iterator = vec2_of_A_refs.iter().enumerate(); // Enumerate<Iter<Vec<&A>>>
}
How can I make this code compile?
In the end (based on the condition) I would like to have an iterator that can be built from either input, and I don't know how to fit these Map and Enumerate types into a single variable without calling collect() to materialize the iterator as a Vec.
Reading material is welcome.
In the vec1_of_A case, first you need to replace &x with x in your map function. The code you have will never compile because the mapping closure tries to return a reference to one of its parameters, which is never allowed in Rust. To make the types match up, you need to dereference the &&A in the vec2_of_A_refs case to &A instead of trying to add a reference to the other.
Also, -127 is an invalid value for usize, so you need to pick a valid value, or use a different type than usize.
Having fixed those, now you need some type of dynamic dispatch. The simplest approach would be boxing into a Box<dyn Iterator>.
Here is a complete example:
#![allow(unused)]
#![allow(non_snake_case)]
struct A;
// Fixed to be a valid usize.
const MY_CONST_USIZE: usize = usize::MAX;
fn my_rand_condition() -> bool { todo!(); }
fn example() {
let vec1_of_A: Vec<A> = vec![];
let vec2_of_A_refs: Vec<&A> = vec![];
let my_iterator: Box<dyn Iterator<Item=(usize, &A)>>;
if my_rand_condition() {
// Fixed to return x instead of &x
my_iterator = Box::new(vec1_of_A.iter().map(|x| (MY_CONST_USIZE, x)));
} else {
// Added map to deref &&A to &A to make the types match
my_iterator = Box::new(vec2_of_A_refs.iter().map(|x| *x).enumerate());
}
for item in my_iterator {
// ...
}
}
Instead of a boxed trait object, you could also use the Either type from the either crate. This is an enum with Left and Right variants, but the Either type itself implements Iterator if both the left and right types also do, with the same type for the Item associated type. For example:
#![allow(unused)]
#![allow(non_snake_case)]
use either::Either;
struct A;
const MY_CONST_USIZE: usize = usize::MAX;
fn my_rand_condition() -> bool { todo!(); }
fn example() {
let vec1_of_A: Vec<A> = vec![];
let vec2_of_A_refs: Vec<&A> = vec![];
let my_iterator;
if my_rand_condition() {
my_iterator = Either::Left(vec1_of_A.iter().map(|x| (MY_CONST_USIZE, x)));
} else {
my_iterator = Either::Right(vec2_of_A_refs.iter().map(|x| *x).enumerate());
}
for item in my_iterator {
// ...
}
}
Why would you choose one approach over the other?
Pros of the Either approach:
It does not require a heap allocation to store the iterator.
It implements dynamic dispatch via match which is likely (but not guaranteed) to be faster than dynamic dispatch via vtable lookup.
Pros of the boxed trait object approach:
It does not depend on any external crates.
It scales easily to many different types of iterators; the Either approach quickly becomes unwieldy with more than two types.
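To illustrate what "dynamic dispatch via match" means, here is a hand-written stand-in for the Either idea that needs no external crate. EitherIter, pick and the tuple struct A(u32) are hypothetical names for this sketch; the real either crate provides more than this.

```rust
// Hypothetical two-variant wrapper, similar in spirit to either::Either.
enum EitherIter<L, R> {
    Left(L),
    Right(R),
}

// The wrapper is itself an iterator when both sides yield the same Item.
impl<L, R> Iterator for EitherIter<L, R>
where
    L: Iterator,
    R: Iterator<Item = L::Item>,
{
    type Item = L::Item;

    fn next(&mut self) -> Option<Self::Item> {
        // The "dispatch" is an ordinary match on the variant.
        match self {
            EitherIter::Left(l) => l.next(),
            EitherIter::Right(r) => r.next(),
        }
    }
}

struct A(u32);

// Builds one of two differently typed iterators and consumes it uniformly.
fn pick<'a>(use_first: bool, owned: &'a [A], refs: &'a [&'a A]) -> Vec<(usize, u32)> {
    let it = if use_first {
        EitherIter::Left(owned.iter().map(|x| (usize::MAX, x)))
    } else {
        // copied() turns each &&A into &A so both sides yield (usize, &A).
        EitherIter::Right(refs.iter().copied().enumerate())
    };
    it.map(|(i, a)| (i, a.0)).collect()
}

fn main() {
    let a = [A(10), A(20)];
    let r = [&a[1], &a[0]];
    println!("{:?}", pick(true, &a, &r));
    println!("{:?}", pick(false, &a, &r));
}
```

No Box is involved, so nothing is allocated on the heap, but every additional iterator type needs another variant, which is where the approach stops scaling.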
You can do this using a boxed trait object like so:
let my_iterator: Box<dyn Iterator<Item = _>> = if my_rand_condition() {
Box::new(vec1_of_A.iter().map(|x| (MY_CONST_USIZE, x)))
} else {
Box::new(vec2_of_A_refs.iter().enumerate().map(|(i, x)| (i, *x)))
};
I don't think this is a good idea generally though. A few things to note:
The use of trait objects means the concrete iterator type is resolved at run time. This adds overhead: a heap allocation for the Box and a vtable call for each next().
The closure in vec1's iterator's map method cannot return a reference to its argument. Instead, a second map must be added to vec2's iterator to turn each &&A into &A. The effect of this is that every reference is copied anyway. If you are doing this, why not collect()? The overhead of creating the Vec (or whatever you choose) may well be less than that of the dynamic dispatch.
Bit pedantic, but remember if statements are expressions in Rust, and so the assignment can be expressed a little more cleanly as I have done above.

Get last element from vector

I have this simple piece of code:
fn main() {
let mut blockchain: Vec<blockchain::Block> = Vec::new();
let genesis_block = blockchain::create_block("genesis_block");
blockchain::add_block_to_blockchain(&mut blockchain, genesis_block);
}
My error occurs here:
pub fn get_last_block(blockchain: &Vec<Block>) -> Block {
return blockchain[blockchain.len() - 1];
}
It says:
I am pretty new to Rust, so can somebody explain to me why this won't work?
I'm just trying to get the last element of this vector.
Should I pass ownership of this vector instead of borrowing it?
EDIT: This is my result now:
pub fn get_last_block(blockchain: &Vec<Block>) -> Option<&Block> {
return blockchain.last();
}
blockchain could be empty, so I check with is_some whether it returned a value:
let block = blockchain::get_last_block(&blockchain);
if block.is_some() {
blockchain::print_block(block.unwrap());
}
Since you are borrowing the vector, you can either:
return a reference to the block
clone the block
pop the block from the vec and return it (you would need to mutably borrow it instead, &mut)
Also, consider using an Option as the return type, in case your vector is empty. With that, you can call last directly, which returns a reference to the last Block (or None if the vector is empty):
pub fn get_last_block(blockchain: &Vec<Block>) -> Option<&Block> {
blockchain.last()
}
Nitpick, you could use a slice instead of a Vec in the function signature:
fn get_last_block(blockchain: &[Block])...
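The three options listed above can be sketched side by side. The Block struct here is a hypothetical stand-in with a single data field, since the original definition is not shown, and the function names are illustrative.

```rust
// Hypothetical Block; the real one from the question is not shown.
#[derive(Debug, Clone, PartialEq)]
struct Block {
    data: String,
}

// Option 1: borrow the last block. No copy is made.
fn last_ref(chain: &[Block]) -> Option<&Block> {
    chain.last()
}

// Option 2: clone the last block. The vector keeps its own copy.
fn last_cloned(chain: &[Block]) -> Option<Block> {
    chain.last().cloned()
}

// Option 3: remove the last block from the vector and return it by value.
// This needs a mutable borrow of the Vec itself.
fn take_last(chain: &mut Vec<Block>) -> Option<Block> {
    chain.pop()
}

fn main() {
    let mut chain = vec![
        Block { data: "genesis_block".to_string() },
        Block { data: "second".to_string() },
    ];
    // if let avoids the is_some() + unwrap() pair.
    if let Some(block) = last_ref(&chain) {
        println!("{:?}", block);
    }
    println!("{:?}", last_cloned(&chain));
    println!("{:?}", take_last(&mut chain));
    println!("{}", chain.len());
}
```

All three return None for an empty chain instead of panicking, which indexing with blockchain.len() - 1 would do.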

How do I avoid incurring in lifetime issues when refactoring a function?

Playground if you want to jump directly into the code.
Problem
I'm trying to implement a function filter_con<T, F>(v: Vec<T>, predicate: F) that allows concurrent filter on a Vec, via async predicates.
That is, instead of doing:
let arr = vec![...];
let arr_filtered = join_all(arr.into_iter().map(|it| async move {
if some_getter(&it).await > some_value {
Some(it)
} else {
None
}
}))
.await
.into_iter()
.filter_map(|it| it)
.collect::<Vec<T>>()
every time I need to filter for a Vec, I want to be able to:
let arr = vec![...];
let arr_filtered = filter_con(arr, |it| async move {
some_getter(&it).await > some_value
}).await
Tentative implementation
I've extracted the logic into its own function, but I am running into lifetime issues:
async fn filter_con<T, B, F>(arr: Vec<T>, predicate: F) -> Vec<T>
where
F: FnMut(&T) -> B,
B: futures::Future<Output = bool>,
{
join_all(arr.into_iter().map(|it| async move {
if predicate(&it).await {
Some(it)
} else {
None
}
}))
.await
.into_iter()
.filter_map(|p| p)
.collect::<Vec<_>>()
}
error[E0507]: cannot move out of a shared reference
I don't understand what I'm moving out of predicate.
For more details, see the playground.
You won't be able to make the predicate an FnOnce, because, if you have 10 items in your Vec, you'll need to call the predicate 10 times, but an FnOnce only guarantees it can be called once, which could lead to something like this:
let vec = vec![1, 2, 3];
let has_drop_impl = String::from("hello");
filter_con(vec, |&i| async {
drop(has_drop_impl);
i < 5
});
So F must be either an FnMut or an Fn. The standard library Iterator::filter takes an FnMut, though this can be a source of confusion (it is the captured variables of the closure that need a mutable reference, not the elements of the iterator).
Because the predicate is an FnMut, any caller needs to be able to get an &mut F. For Iterator::filter, this can be used to do something like this:
let vec = vec![1, 2, 3];
let mut count = 0;
vec.into_iter().filter(|&x| {
count += 1; // this line makes the closure an `FnMut`
x < 2
})
However, by sending the iterator to join_all, you are essentially allowing your async runtime to schedule these calls as it wants, potentially at the same time, which would cause an aliased &mut T, which is always undefined behaviour. A slightly more cut-down version of the same problem is discussed in https://github.com/rust-lang/rust/issues/69446.
I'm still not 100% on the details, but it seems the compiler is being conservative here and doesn't even let you create the closure in the first place to prevent soundness issues.
I'd recommend making your function only accept Fns. This way, your runtime is free to call the function however it wants. This does mean that your closure cannot have mutable state, but this is unlikely to be a problem in a tokio application. For the counting example, the "correct" solution is to use an AtomicUsize (or equivalent), which allows mutation via shared reference. If you're referencing mutable state in your filter call, it should be thread safe, and thread safe data structures generally allow mutation via shared reference.
Given that restriction, the following gives the answer you expect:
async fn filter_con<T, B, F>(arr: Vec<T>, predicate: F) -> Vec<T>
where
F: Fn(&T) -> B,
B: Future<Output = bool>,
{
join_all(arr.into_iter().map(|it| async {
if predicate(&it).await {
Some(it)
} else {
None
}
}))
.await
.into_iter()
.filter_map(|p| p)
.collect::<Vec<_>>()
}
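The AtomicUsize suggestion for the counting example can be sketched as follows. To keep the example free of external crates it uses a synchronous stand-in, filter_with_fn, which is a hypothetical helper and not the async filter_con; the point it shows is only that a closure which counts through an AtomicUsize satisfies an Fn bound.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical synchronous stand-in for filter_con: it accepts only Fn,
// so the predicate may be called through a shared reference.
fn filter_with_fn<T, F>(items: Vec<T>, predicate: F) -> Vec<T>
where
    F: Fn(&T) -> bool,
{
    items.into_iter().filter(|it| predicate(it)).collect()
}

// Counts predicate calls without making the closure an FnMut.
fn count_and_filter(items: Vec<i32>) -> (Vec<i32>, usize) {
    let count = AtomicUsize::new(0);
    let kept = filter_with_fn(items, |&x| {
        // fetch_add takes &self, so the closure only captures &count.
        count.fetch_add(1, Ordering::Relaxed);
        x < 2
    });
    (kept, count.load(Ordering::Relaxed))
}

fn main() {
    println!("{:?}", count_and_filter(vec![1, 2, 3]));
}
```

Replacing the atomic with a plain `let mut count = 0; count += 1;` would turn the closure into an FnMut and the call to filter_with_fn would no longer compile.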

Is using `ref` in a function argument the same as automatically taking a reference?

Rust tutorials often advocate passing an argument by reference:
fn my_func(x: &Something)
This makes it necessary to explicitly take a reference of the value at the call site:
my_func(&my_value).
It is possible to use the ref keyword usually used in pattern matching:
fn my_func(ref x: Something)
I can call this by doing
my_func(my_value)
Memory-wise, does this work like I expect or does it copy my_value on the stack before calling my_func and then get a reference to the copy?
The value is copied, and the copy is then referenced.
fn f(ref mut x: i32) {
*x = 12;
}
fn main() {
let mut x = 42;
f(x);
println!("{}", x);
}
Output: 42
Both functions declare x to be &Something. The difference is that the former takes a reference as the parameter, while the latter expects it to be a regular stack value. To illustrate:
#[derive(Debug, Clone, Copy)]
struct Something;
fn by_reference(x: &Something) {
println!("{:?}", x); // x is a &Something; prints "Something"
}
fn on_the_stack(ref x: Something) {
println!("{:?}", x); // x is a &Something; prints "Something"
}
fn main() {
let value_on_the_stack: Something = Something;
let owned: Box<Something> = Box::new(Something);
let borrowed: &Something = &value_on_the_stack;
// Compiles:
on_the_stack(value_on_the_stack);
// Fail to compile:
// on_the_stack(owned);
// on_the_stack(borrowed);
// Dereferencing will do:
on_the_stack(*owned);
on_the_stack(*borrowed);
// Compiles:
by_reference(&owned); // &Box<Something> coerces to &Something (deref coercion)
by_reference(borrowed);
// Fails to compile:
// by_reference(value_on_the_stack);
// Taking a reference will do:
by_reference(&value_on_the_stack);
}
Since on_the_stack takes a value, it gets copied, then the copy matches against the pattern in the formal parameter (ref x in your example). The match binds x to the reference to the copied value.
If you call a function like f(x) then x is always passed by value.
fn f(ref x: i32) {
// ...
}
is equivalent to
fn f(tmp: i32) {
let ref x = tmp;
// or,
let x = &tmp;
// ...
}
i.e. the referencing is completely restricted to the function call.
The difference between your two functions becomes much more pronounced and obvious if the value doesn't implement Copy. For example, a Vec<T> doesn't implement Copy, because copying it is an expensive operation. Instead, it implements Clone (which requires an explicit method call).
Assume two methods are defined as such
fn take_ref(ref v: Vec<String>) {}// Takes a reference, ish
fn take_addr(v: &Vec<String>) {}// Takes an explicit reference
take_ref will try to copy the value passed, before referencing it. For Vec<T>, this is actually a move operation (because Vec is not Copy). This consumes the vector, meaning the following code would cause a compiler error:
let v: Vec<String>; // assume a real value
take_ref(v);// Value is moved here
println!("{:?}", v);// Error, v was moved on the previous line
However, when the reference is explicit, as in take_addr, the Vec isn't moved but passed by reference. Therefore, this code does work as intended:
let v: Vec<String>; // assume a real value
take_addr(&v);
println!("{:?}", v);// Prints contents as you would expect
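Putting both variants into one runnable sketch makes the difference easy to try out. The two functions are changed to return the vector length (an assumption made here so that there is something to observe); the commented-out line is the one the compiler rejects.

```rust
// `ref v` binds a reference inside the function, but the argument itself
// is still passed by value, so the caller's Vec is moved in.
fn take_ref(ref v: Vec<String>) -> usize {
    v.len()
}

// The caller explicitly lends the Vec; ownership stays with the caller.
fn take_addr(v: &Vec<String>) -> usize {
    v.len()
}

fn main() {
    let v = vec![String::from("a"), String::from("b")];

    // Borrowing: v is still usable afterwards.
    println!("{}", take_addr(&v));

    // Passing a clone keeps the original alive, at the cost of a copy.
    println!("{}", take_ref(v.clone()));

    // Passing v itself moves it into the function.
    println!("{}", take_ref(v));

    // println!("{:?}", v); // error[E0382]: borrow of moved value: `v`
}
```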

Resources