Doing more than 1 thing in a iter().map()


I would like to use a map to create a new vector, but at the same time, do other things inside that map. I'm working on Advent of Code 2021, day 6 part 1.
This code loops through a vector and decrements all the values by one. If the value is at 0, then it resets that position to 6 and adds an 8 to the end of the vector.
fn run_growth_simulation(mut state: Vec<u8>, days: i32) -> usize {
    for _day in 0..days {
        let mut new_fish = 0;
        state.iter_mut().map(|x| match x {
            num: u8 @ 1..=8 => {num - 1},
            0 => {new_fish += 1; 6},
            _ => unreachable!()
        })
        for _fish in 0..new_fish {
            state.push(8);
        }
    }
    state.iter().count() as usize
}
How do I return the right item from the closure?

I would mutate the values in place through the iterator instead of building a new vector. For that, use for_each instead of map (or, preferably, a plain for loop).
Then mutate the value inside the match statement:
state.iter_mut().for_each(|x| match x {
    // `: u8` removed because it gave me a syntax error
    // mutate the number directly (we have to use `num` because x was moved)
    num @ 1..=8 => {*num -= 1;},
    // mutate the number
    0 => {new_fish += 1; *x = 6;},
    _ => unreachable!()
});
A slightly different approach would be to count the 0s in the vector, remove them, subtract 1 from each remaining value, and then add the new fish.
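The count-and-remove variant mentioned above could be sketched roughly as follows. This is only an illustration under a few assumptions of mine: the fish that were at 0 are pushed back with a timer of 6 (the reset described in the question) next to the newborns at 8, and `days` is taken as `u32`.

```rust
// Sketch of the "count the zeros, remove them, decrement the rest" idea.
// Assumption: removed fish re-enter with timer 6, and each spawns one fish with timer 8.
fn run_growth_simulation(mut state: Vec<u8>, days: u32) -> usize {
    for _day in 0..days {
        // Count the fish whose timer is 0, then drop them from the vector.
        let spawning = state.iter().filter(|&&timer| timer == 0).count();
        state.retain(|&timer| timer != 0);

        // Every remaining fish has a timer of at least 1, so this cannot underflow.
        for timer in state.iter_mut() {
            *timer -= 1;
        }

        // The parents come back at 6 and the newborns start at 8.
        state.extend(std::iter::repeat(6).take(spawning));
        state.extend(std::iter::repeat(8).take(spawning));
    }
    state.len()
}
```

No iterator is alive while the vector grows, so the borrow checker has nothing to object to. With the puzzle's sample input, `run_growth_simulation(vec![3, 4, 3, 1, 2], 18)` gives 26.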

As a complement to the answer stating that for_each() is preferable to map() here (since we don't consume what map() emits), below is a simpler example trying to illustrate the problem (and why the borrow-checker is right when it forbids such attempts).
In both cases (test1() and test2()) we are iterating over a vector while we are extending it (this is what was intended in the question).
In test1(), the iterator records where the vector's storage lives once and for all when it is created.
All subsequent iterations refer to this initial storage, so it must not move elsewhere in memory in the meantime.
That's why the iterator borrows the vector (mutably or not, this is not important here).
However, during these iterations we try to append new values to this vector. This may move the storage (if a reallocation is needed), and fortunately it requires a mutable borrow of the vector, so the compiler rejects it.
In test2() we avoid keeping a reference to the initial storage, and use a counter instead.
This works, but it is suboptimal because the index operation ([]) has to check the bounds at each iteration.
The iterator in the previous function knows the bounds once and for all; that's why iterators give the compiler better optimisation opportunities.
Note that len() is evaluated once and for all at the beginning of the loop here; this is probably what we want, but if we wanted to re-evaluate it at each iteration, we would have to use a loop {} instruction.
What is discussed here is not specific to the language but to the problem itself.
With a more permissive programming language, the first attempt might have been allowed but would have led to memory errors; otherwise such a language would have to fall back systematically on the second approach and pay the cost of bounds checking at each iteration.
In the end, your solution with a second loop is probably the best choice.
fn test1() {
    let mut v = vec![1, 2, 3, 4, 5, 6, 7, 8];
    v.iter_mut().for_each(|e| {
        if *e <= 3 {
            let n = *e + 100;
            // v.push(n) // !!! INCORRECT !!!
            // we are trying to reallocate the storage while iterating over it
        } else {
            *e += 10;
        }
    });
    println!("{:?}", v);
}
fn test2() {
    let mut v = vec![1, 2, 3, 4, 5, 6, 7, 8];
    for i in 0..v.len() {
        let e = &mut v[i];
        if *e <= 3 {
            let n = *e + 100;
            v.push(n);
        } else {
            *e += 10;
        }
    }
    println!("{:?}", v);
}
fn main() {
    test1(); // [1, 2, 3, 14, 15, 16, 17, 18]
    test2(); // [1, 2, 3, 14, 15, 16, 17, 18, 101, 102, 103]
}
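To illustrate the remark about re-evaluating len() at each iteration, here is a small sketch. The name test3 and the shorter input are mine, and it uses a while loop, which is just a loop {} with the break condition written in the header.

```rust
// Variant of test2() in which the bound is re-read on every iteration,
// so the elements pushed during the loop are visited as well.
fn test3() -> Vec<i32> {
    let mut v = vec![1, 2, 3, 4];
    let mut i = 0;
    while i < v.len() {
        if v[i] <= 3 {
            let n = v[i] + 100;
            v.push(n); // fine: no reference into `v` is alive here
        } else {
            v[i] += 10;
        }
        i += 1;
    }
    v
}
```

Here the three pushed values (101, 102, 103) are themselves visited and incremented, so the result is [1, 2, 3, 14, 111, 112, 113]. The loop only terminates because the pushed values are greater than 3; with a different rule it could grow forever, which is one reason why evaluating len() once is usually what we want.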

Related

Given `Vec<HashSet>`, how to update `v[i]` while iterating `v[i - 1]`? [duplicate]

This question already has answers here:
How to get mutable references to two array elements at the same time?
(8 answers)
Closed 2 months ago.
Let v be Vec<HashSet<usize>>.
Is it possible to update v[i] while iterating v[i - 1]?
Normally, Rust's ownership rule doesn't allow this, but I believe some way should exist since v[i] and v[i - 1] are essentially independent.
unsafe is allowed because unsafe sometimes lets us bypass (in a sense) the ownership rule. (For example, swapping values of HashMap is normally impossible, but using unsafe makes it possible. ref: swapping two entries of a HashMap)
Please assume v.len() is very large because, if v.len() were small, you could give up using a Vec container in the first place.
A very artificial but minimal working example is shown below. This kind of code often appears in dynamic programming.
use std::collections::HashSet;
fn main() {
    let n = 100000;
    let mut v = vec![HashSet::new(); n];
    v[0].insert(0);
    for i in 1..n {
        v[i] = v[i - 1].clone(); // This `clone()` is necessary.
        let prev = v[i - 1].clone(); // I want to eliminate this `clone()`.
        prev.iter().for_each(|e| {
            v[i].insert(*e + 1);
        })
    }
    println!("{:?}", v); //=> [{0}, {0, 1}, {0, 1, 2}, {0, 1, 2, 3}, ...]
}

When you modify a vector with v[i], you are using the IndexMut trait, which requires a mutable borrow of Self, i.e. the whole vector. For this reason, Rust will never allow borrowing v[i] and v[i-1] through indexing at the same time if at least one of them is a mutable borrow.
To solve this issue, you must work a little harder to make Rust understand v[i] and v[i-1] are not aliased (because, in the end, all the borrow checking stuff ends up in LLVM being able to tell if something is aliased, or not).
The "bad" news is that it's impossible to do so without relying on unsafe somewhere. The good news is that someone else already did that, and wrapped it in a safe interface, namely split_at_mut. This will break a single vector into two subslices, which are guaranteed to be disjoint (this is where unsafe kicks in).
So, for instance, in your case, you could do
use std::collections::HashSet;
fn main() {
    let n = 100000;
    let mut v = vec![HashSet::new(); n];
    v[0].insert(0);
    for i in 1..n {
        v[i] = v[i - 1].clone(); // This `clone()` is necessary.
        let (left, right) = v.split_at_mut(i);
        left[i-1].iter().for_each(|e| {
            right[0].insert(*e + 1);
        })
    }
    println!("{:?}", v); //=> [{0}, {0, 1}, {0, 1, 2}, {0, 1, 2, 3}, ...]
}
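If this pattern comes up repeatedly, the split can be hidden behind a small helper. The following is only a sketch: prev_and_current and build are hypothetical names of mine, not standard library functions, and build assumes n >= 1.

```rust
use std::collections::HashSet;

// Hypothetical helper: borrow v[i - 1] immutably and v[i] mutably at the same time.
// Panics if i == 0 or i >= v.len().
fn prev_and_current<T>(v: &mut [T], i: usize) -> (&T, &mut T) {
    // `left` covers indices 0..i and `right` covers i.., so the two never overlap.
    let (left, right) = v.split_at_mut(i);
    (&left[i - 1], &mut right[0])
}

// Builds [{0}, {0, 1}, {0, 1, 2}, ...] without cloning any set (assumes n >= 1).
fn build(n: usize) -> Vec<HashSet<usize>> {
    let mut v = vec![HashSet::new(); n];
    v[0].insert(0);
    for i in 1..n {
        let (prev, current) = prev_and_current(&mut v, i);
        for &e in prev.iter() {
            current.insert(e); // takes the place of `v[i] = v[i - 1].clone()`
            current.insert(e + 1);
        }
    }
    v
}
```

In this particular example even the first clone() can go, because v[i] starts out empty and can be filled element by element from v[i - 1]. Whether that is faster than cloning would have to be measured.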
Besides, maybe this is just because your example is simplified, but there is actually no point in creating 100000 HashSets if you just overwrite them right away. A simpler solution would be
use std::collections::HashSet;
fn main() {
    let n = 100000;
    let mut v = Vec::with_capacity(n);
    // `push`, not `insert`: Vec::insert also needs an index
    v.push(HashSet::from([0]));
    for i in 0..n-1 {
        let mut new_set = v[i].clone();
        for e in v[i].iter().copied() {
            new_set.insert(e+1);
        }
        v.push(new_set);
    }
    println!("{:?}", v);
}

parallel sorting on separate sections of a single slice

I'm trying to implement a sort of parallel bubble sort, i.e. have a number of threads work on distinct parts of the same slice and then have a final thread combine the sorted parts, similar to a kind of merge sort.
I have this code so far:
pub fn parallel_bubble_sort(to_sort: Arc<&[i32]>) {
    let midpoint = to_sort.len() / 2;
    let ranges = [0..midpoint, midpoint..to_sort.len()];
    let handles = (ranges).map(|range| {
        thread::spawn(|| {
            to_sort[range].sort();
        })
    });
}
But I get a series of errors relating to to_sort's lifetime, etc.
How would someone go about modifying distinct slices of a larger slice across thread bounds?

Disclaimer: I assume that you want to sort in place, as you call .sort().
There are a couple of problems with your code:
to_sort isn't mutable, so you won't be able to modify it, which is an essential part of sorting ;) So I think that Arc<&[i32]> should almost certainly be &mut [i32].
You cannot split a mutable slice like this. Rust doesn't know whether your ranges overlap, and therefore disallows this entirely. You can, however, use split_at to split it into two parts; its counterpart split_at_mut does the same for mutable references, which is important in your case.
You cannot move mutable references into threads, because it's unknown how long the thread will exist. Overcoming this issue is the hardest part, I'm afraid; I don't know how easy it is in normal Rust without the use of unsafe. I think the easiest solution would be to use a library like rayon, which has already solved those problems for you.
EDIT: Rust 1.63 introduces scoped threads, which eliminate the need for rayon in this use case.
This should be a good start for you:
pub fn parallel_bubble_sort(to_sort: &mut [i32]) {
    let midpoint = to_sort.len() / 2;
    let (left, right) = to_sort.split_at_mut(midpoint);
    std::thread::scope(|s| {
        s.spawn(|| left.sort());
        s.spawn(|| right.sort());
    });
    // TODO: merge left and right
}
fn main() {
    let mut data = [1, 6, 3, 4, 9, 7, 4];
    parallel_bubble_sort(&mut data);
    println!("{:?}", data);
}
[1, 3, 6, 4, 4, 7, 9]
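The merge step left as a TODO could look roughly like this. It is a sketch rather than a tuned implementation: merge_sorted_halves and parallel_sort_and_merge are names I made up, and the merge goes through a temporary Vec instead of working in place.

```rust
use std::thread;

// Merge the two sorted runs data[..midpoint] and data[midpoint..] via a temporary buffer.
fn merge_sorted_halves(data: &mut [i32], midpoint: usize) {
    let mut merged = Vec::with_capacity(data.len());
    let (mut i, mut j) = (0, midpoint);
    while i < midpoint && j < data.len() {
        if data[i] <= data[j] {
            merged.push(data[i]);
            i += 1;
        } else {
            merged.push(data[j]);
            j += 1;
        }
    }
    // At most one of these two tails is non-empty.
    merged.extend_from_slice(&data[i..midpoint]);
    merged.extend_from_slice(&data[j..]);
    data.copy_from_slice(&merged);
}

// Sort both halves on scoped threads (Rust 1.63+), then merge on the calling thread.
fn parallel_sort_and_merge(to_sort: &mut [i32]) {
    let midpoint = to_sort.len() / 2;
    let (left, right) = to_sort.split_at_mut(midpoint);
    thread::scope(|s| {
        s.spawn(|| left.sort());
        s.spawn(|| right.sort());
    });
    // Both threads have been joined when `scope` returns, so `to_sort` is usable again.
    merge_sorted_halves(to_sort, midpoint);
}
```

With the input above, [1, 6, 3, 4, 9, 7, 4], this yields [1, 3, 4, 4, 6, 7, 9].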
Previous answer for Rust versions older than 1.63
pub fn parallel_bubble_sort(to_sort: &mut [i32]) {
    let midpoint = to_sort.len() / 2;
    let (left, right) = to_sort.split_at_mut(midpoint);
    rayon::scope(|s| {
        s.spawn(|_| left.sort());
        s.spawn(|_| right.sort());
    });
    // TODO: merge left and right
}
fn main() {
    let mut data = [1, 6, 3, 4, 9, 7, 4];
    parallel_bubble_sort(&mut data);
    println!("{:?}", data);
}
[1, 3, 6, 4, 4, 7, 9]

How to interlace an iterator with itself from the end?

I have an iterator in the form 0..=63, i.e.
0 1 2 3 4 ... 59 60 61 62 63.
Its .count() is 64.
How would I get the following iterator:
0 63 1 62 2 61 3 60 4 59 ...
(of course independent of the items present in the iterator), preferably without cloning?
The .count() should stay the same, as only the order of items should change.
I've looked in the standard library and couldn't find it, same in the itertools crate.

Here's one way using only the standard library. It requires a DoubleEndedIterator and will skip the last item for odd sized iterators:
fn main() {
    let mut range = (0..=63).into_iter();
    let iter = std::iter::from_fn(|| Some([range.next()?, range.next_back()?])).flatten();
    dbg!(iter.collect::<Vec<_>>());
}
Output:
[src/main.rs:4] iter.collect::<Vec<_>>() = [
    0,
    63,
    1,
    62,
    2,
    61,
    3,
    ...
    30,
    33,
    31,
    32,
]
@Finomnis has posted a solution in case your input has an odd number of items.

This solution works for all iterators that implement DoubleEndedIterator:
Note that this solution is guaranteed to return all items, regardless of whether the iterator contains an even or odd number of them.
fn selfinterlace<Iter>(mut iter: Iter) -> impl Iterator<Item = Iter::Item>
where
    Iter: DoubleEndedIterator,
{
    let mut from_front = false;
    std::iter::from_fn(move || {
        from_front = !from_front;
        if from_front {
            iter.next()
        } else {
            iter.next_back()
        }
    })
}
fn main() {
    let range = (0..=8).into_iter();
    let iter = selfinterlace(range);
    println!("{:?}", iter.collect::<Vec<_>>());
}
[0, 8, 1, 7, 2, 6, 3, 5, 4]
The idea is that you store whether the next item should be from the front or the back, and then flip that in every iteration.
from_fn can take FnMut, meaning, it can take closures that store internal state. The internal state in this closure consists of the variables iter and from_front, which get moved into the closure through the move || keyword.
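To make the closure's hidden state visible, the same idea can be written as a named struct that implements Iterator. This is just an equivalent sketch; the names SelfInterlace and self_interlace are mine.

```rust
// The two captured variables of the closure become the fields of a struct.
struct SelfInterlace<I> {
    iter: I,
    from_front: bool,
}

impl<I: DoubleEndedIterator> Iterator for SelfInterlace<I> {
    type Item = I::Item;

    fn next(&mut self) -> Option<I::Item> {
        // Flip the flag first, exactly like the closure version does.
        self.from_front = !self.from_front;
        if self.from_front {
            self.iter.next()
        } else {
            self.iter.next_back()
        }
    }
}

fn self_interlace<I: DoubleEndedIterator>(iter: I) -> SelfInterlace<I> {
    SelfInterlace { iter, from_front: false }
}
```

Calling `self_interlace(0..=8).collect::<Vec<_>>()` gives the same [0, 8, 1, 7, 2, 6, 3, 5, 4] as the closure version. The struct is more verbose, but it has a nameable type, which can be convenient if the iterator has to be stored in a field.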

You could use itertools::interleave to interleave forward and reverse iterators.
RangeInclusive<i32> doesn't implement ExactSizeIterator so there's no .len() function. We have to calculate it ourselves.
If the range has an odd length the extra item will show up in the forward half thanks to (len + 1).
let range = 0..=63;
let len = range.end() - range.start() + 1;
let iter = itertools::interleave(
    range.clone().take((len + 1) / 2),
    range.clone().rev().take(len / 2),
);

How to set a range in a Vec or slice?

My end goal is to shuffle the rows of a matrix (for which I am using nalgebra).
To address this I need to set a mutable range (slice) of an array.
Supposing I have an array as such (let's say it's a 3x3 matrix):
let mut scores = [7, 8, 9, 10, 11, 12, 13, 14, 15];
I have extracted a row like this:
let r = &scores[..].chunks(3).collect::<Vec<_>>()[1];
Now, for the knuth shuffle I need to swap this with another row. What I need to do is:
scores.chunks_mut(3)[0] = r;
however this fails as such:
cannot index a value of type `core::slice::ChunksMut<'_, _>`
Example: http://is.gd/ULkN6j

I ended up looping over the columns and swapping element by element, which seems like a cleaner implementation to me:
fn swap_row<T>(matrix: &mut [T], row_src: usize, row_dest: usize, cols: usize) {
    for c in 0..cols {
        matrix.swap(cols * row_src + c, cols * row_dest + c);
    }
}
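A possible alternative, sketched here under the same row-major layout assumption, is to split the slice between the two rows and swap them as whole sub-slices with swap_with_slice. The name swap_rows is mine.

```rust
// Swap two rows of a row-major matrix stored in a flat slice.
// Panics if either row lies outside the slice.
fn swap_rows<T>(matrix: &mut [T], row_a: usize, row_b: usize, cols: usize) {
    if row_a == row_b {
        return;
    }
    let (lo, hi) = if row_a < row_b { (row_a, row_b) } else { (row_b, row_a) };
    // `head` ends where the higher row starts, so the two rows land in disjoint slices.
    let (head, tail) = matrix.split_at_mut(hi * cols);
    head[lo * cols..(lo + 1) * cols].swap_with_slice(&mut tail[..cols]);
}
```

For the 3x3 example from the question, `swap_rows(&mut scores, 0, 2, 3)` turns the array into [13, 14, 15, 10, 11, 12, 7, 8, 9]. No Clone or Copy bound is needed, because the elements are swapped rather than copied.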

Your code, as you'd like to write it, can never work. You have an array that you are trying to read from and write to at the same time. This will cause you to have duplicated data:
[1, 2, 3, 4]
// Copy last two to first two
[3, 4, 3, 4]
// Copy first two to last two
[3, 4, 3, 4]
Rust will prevent you from having mutable and immutable references to the same thing for this very reason.
cannot index a value of type core::slice::ChunksMut<'_, _>
chunks_mut returns an iterator. The only thing that an iterator is guaranteed to do is return "the next thing". You cannot index it, because the iterator is not a collection that holds all the chunks at once.
To move things around, you are going to need somewhere temporary to store the data. One way is to copy the array:
let scores = [7, 8, 9, 10, 11, 12, 13, 14, 15];
let mut new_scores = scores;
for (old, new) in scores[0..3].iter().zip(new_scores[6..9].iter_mut()) {
    *new = *old;
}
for (old, new) in scores[3..6].iter().zip(new_scores[0..3].iter_mut()) {
    *new = *old;
}
for (old, new) in scores[6..9].iter().zip(new_scores[3..6].iter_mut()) {
    *new = *old;
}
Then it's a matter of following one of these existing questions to copy from one to the other.

That's probably closer to what you wanted to do:
fn swap_row<T: Clone>(matrix: &mut [T], row_src: usize, row_dest: usize, cols: usize) {
    let v = matrix[..].to_vec();
    let mut chunks = v.chunks(cols).collect::<Vec<&[T]>>();
    chunks.swap(row_src, row_dest);
    matrix.clone_from_slice(chunks.into_iter().fold((&[]).to_vec(), |c1, c2| [c1, c2.to_vec()].concat()).as_slice());
}
I would prefer:
fn swap_row<T: Clone>(matrix: &[T], row_src: usize, row_dest: usize, cols: usize) -> Vec<T> {
    let mut chunks = matrix[..].chunks(cols).collect::<Vec<&[T]>>();
    chunks.swap(row_src, row_dest);
    chunks.iter().fold((&[]).to_vec(), |c1, c2| [c1, c2.to_vec()].concat())
}
By the way, nalgebra provides unsafe fn as_slice_unchecked(&self) -> &[T] for all kinds of Storage and RawStorage.
Shuffling this slice avoids the need for row swapping.

How to remove an element from a vector given the element?

Is there a simple way to remove an element from a Vec<T>?
There's a method called remove(), and it takes an index: usize, but there isn't even an index_of() method that I can see.
I'm looking for something (hopefully) simple and O(n).

This is what I have come up with so far (it also makes the borrow checker happy):
let index = xs.iter().position(|x| *x == some_x).unwrap();
xs.remove(index);
I'm still waiting to find a better way to do this as this is pretty ugly.
Note: my code assumes the element does exist (hence the .unwrap()).

You can use the retain method but it will delete every instance of the value:
fn main() {
    let mut xs = vec![1, 2, 3];
    let some_x = 2;
    xs.retain(|&x| x != some_x);
    println!("{:?}", xs); // prints [1, 3]
}

Your question is under-specified: do you want to remove all items equal to your needle or just one? If one, the first or the last? And what if there is no single element equal to your needle? And can it be removed with the fast swap_remove or do you need the slower remove? To force programmers to think about those questions, there is no simple method to "remove an item" (see this discussion for more information).
Remove first element equal to needle
// Panic if no such element is found
vec.remove(vec.iter().position(|x| *x == needle).expect("needle not found"));
// Ignore if no such element is found
if let Some(pos) = vec.iter().position(|x| *x == needle) {
    vec.remove(pos);
}
You can of course handle the None case however you like (panic and ignoring are not the only possibilities).
Remove last element equal to needle
Like the first element, but replace position with rposition.
Remove all elements equal to needle
vec.retain(|x| *x != needle);
... or with swap_remove
Remember that remove has a runtime of O(n) as all elements after the index need to be shifted. Vec::swap_remove has a runtime of O(1) as it swaps the to-be-removed element with the last one. If the order of elements is not important in your case, use swap_remove instead of remove!
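The difference between the two methods is easiest to see side by side. The two helper names below are hypothetical, chosen only for this sketch.

```rust
// Remove the first element equal to `needle`, keeping the order of the rest (O(n) shift).
fn remove_first<T: PartialEq>(vec: &mut Vec<T>, needle: &T) -> Option<T> {
    let pos = vec.iter().position(|x| x == needle)?;
    Some(vec.remove(pos))
}

// Same search, but the hole is filled with the last element, so the order changes (O(1) removal).
fn swap_remove_first<T: PartialEq>(vec: &mut Vec<T>, needle: &T) -> Option<T> {
    let pos = vec.iter().position(|x| x == needle)?;
    Some(vec.swap_remove(pos))
}
```

Starting from vec![1, 2, 3, 4] and removing 2, remove_first leaves [1, 3, 4] while swap_remove_first leaves [1, 4, 3]. Both return None when the needle is absent. Note that the search with position is still O(n) in either case; only the removal step differs.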

There is a position() method for iterators which returns the index of the first element matching a predicate. Related question: Is there an equivalent of JavaScript's indexOf for Rust arrays?
And a code example:
fn main() {
    let mut vec = vec![1, 2, 3, 4];
    println!("Before: {:?}", vec);
    let removed = vec.iter()
        .position(|&n| n > 2)
        .map(|e| vec.remove(e))
        .is_some();
    println!("Did we remove anything? {}", removed);
    println!("After: {:?}", vec);
}

Is drain_filter() newer than the previous answers?
It seems similar to Kai's answer:
#![feature(drain_filter)]
let mut numbers = vec![1, 2, 3, 4, 5, 6, 8, 9, 11, 13, 14, 15];
numbers.drain_filter(|x| *x % 2 == 0).collect::<Vec<_>>();
assert_eq!(numbers, vec![1, 3, 5, 9, 11, 13, 15]);
https://doc.rust-lang.org/std/vec/struct.Vec.html#method.drain_filter
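Since the snippet above needs a feature gate, here is a sketch of a similar result using only stable standard-library calls (std::mem::take and Iterator::partition). The function name extract_evens is made up for the illustration, and unlike drain_filter this version rebuilds the vector instead of editing it in place.

```rust
// Move the even numbers out of `numbers` and return them; the odd ones stay behind.
fn extract_evens(numbers: &mut Vec<i32>) -> Vec<i32> {
    // `take` leaves an empty Vec in place so the old contents can be consumed by value.
    let (evens, odds): (Vec<i32>, Vec<i32>) = std::mem::take(numbers)
        .into_iter()
        .partition(|x| x % 2 == 0);
    *numbers = odds;
    evens
}
```

Applied to the vector from the example, the removed values come back as [2, 4, 6, 8, 14] and numbers is left holding [1, 3, 5, 9, 11, 13, 15]. If the removed elements are not needed, retain is simpler.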

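If your data is sorted, use binary search to locate the element in O(log n), which can be much faster for large inputs (remove itself still has to shift the elements after the index):
match values.binary_search(value) {
    Ok(removal_index) => {
        // Discard the removed element so that both arms have type ().
        values.remove(removal_index);
    }
    Err(_) => {} // value not contained.
}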
