Accessing an array while iterating over it mutably - rust

Disclaimer: I am fairly new to Rust.
Simplified Use-Case
From the best practices I have read about Rust so far, I understand that iterating with for elem in array {} is preferred over for i in 0..array.len() {}.
Is there any way to iterate over an array mutably, while simultaneously accessing specific elements from it by index?
My use case is quite complex, so I wrote a simple Fibonacci calculator to demonstrate the problem:
let mut arr = vec![0;10];
arr[0] = 1;
arr[1] = 1;
for (i, elem) in arr.iter_mut().enumerate().skip(2) {
*elem = arr[i-2] + arr[i-1];
}
println!("{:?}", arr);
error[E0502]: cannot borrow arr as immutable because it is also borrowed as mutable
Of course this makes sense, but is there a way around it? From a programmer's perspective, this code looks obviously safe, because we borrow immutably from an array that we already hold mutably in the current context, just not directly but through an iterator.
Of course, if I implement it by iterating over the indices, it works:
let mut arr = vec![0;10];
arr[0] = 1;
arr[1] = 1;
for i in 2..arr.len(){
arr[i] = arr[i-2] + arr[i-1];
}
println!("{:?}", arr);
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
So my question is: is there another way to solve this problem, or do I have to use the second version?
Real use-case
This code is to demonstrate my use-case and does not do anything on its own.
let mut labels = vec![vec![0; width]; height];
for (y, row) in labels.iter_mut().enumerate() {
    for (x, label) in row.iter_mut().enumerate() {
        let label_left = {
            if x > 0 && some_condition() {
                Some(labels[y][x - 1]) // <== Fails
            } else {
                None
            }
        };
        let label_top = {
            if y > 0 && some_condition() {
                Some(labels[y - 1][x]) // <== Fails
            } else {
                None
            }
        };
        *label = some_function(label_left, label_top);
    }
}
Rewriting this with a 2D-index based iteration feels a lot like I'm trying to force C programming style into Rust, so I can't believe it's the intended way.

A more 'functional' way to implement your simplified use-case could be:
fn main() {
let mut arr = vec![0; 10];
arr[0] = 1;
arr[1] = 1;
let arr: Vec<i32> = arr
.iter()
.skip(2)
.scan((arr[0], arr[1]), |pair, _| {
let (a, b) = *pair;
let c = a + b;
*pair = (b, c);
Some(c)
})
.collect();
println!("{:?}", arr);
}
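Note that, as written, the collected vector contains only the eight computed values (2 through 55), because the two seed elements are skipped. This is also not necessarily more rusty or easier to read than iterating with an index. That said, if you are willing to go down the FP rabbit hole, it can be very rewarding.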
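If the goal is to write to one element while reading earlier ones, one std-only option is slice::split_at_mut, which splits a slice into two non-overlapping halves that the borrow checker accepts. This is only a sketch and not something the answers above propose; the name fill_fibonacci is made up for illustration.

```rust
/// Fills `arr` with Fibonacci numbers, assuming the first two slots hold the seeds.
/// `fill_fibonacci` is a hypothetical helper name, not part of any library.
fn fill_fibonacci(arr: &mut [u64]) {
    for i in 2..arr.len() {
        // `head` is arr[..i] (only read here); `tail` starts at arr[i].
        let (head, tail) = arr.split_at_mut(i);
        tail[0] = head[i - 2] + head[i - 1];
    }
}

fn main() {
    let mut arr = vec![0u64; 10];
    arr[0] = 1;
    arr[1] = 1;
    fill_fibonacci(&mut arr);
    println!("{:?}", arr); // [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
}
```

For the two-dimensional case, the same call on the outer vector (labels.split_at_mut(y)) would give read access to the rows above while the current row is mutated. For this simple example, the index loop from the question remains the shorter choice.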

Related

How to let several threads write to the same variable without mutex in Rust?

I am trying to implement an outer function that could calculate the outer product of two 1D arrays. Something like this:
use std::thread;
use ndarray::prelude::*;
pub fn multithread_outer(A: &Array1<f64>, B: &Array1<f64>) -> Array2<f64> {
let mut result = Array2::<f64>::default((A.len(), B.len()));
let thread_num = 5;
let n = A.len() / thread_num;
// a & b are ArcArray2<f64>
let a = A.to_owned().into_shared();
let b = B.to_owned().into_shared();
for i in 0..thread_num{
let a = a.clone();
let b = b.clone();
thread::spawn(move || {
for j in i * n..(i + 1) * n {
for k in 0..b.len() {
// This is the line I want to change
result[[j, k]] = a[j] * b[k];
}
}
});
}
// Use join to make sure all threads finish here
// Not so related to this question, so I didn't put it here
result
}
You can see that, by design, two threads will never write to the same element. However, the Rust compiler will not allow two mutable references to the same result variable, and using a mutex would make this much slower. What is the right way to implement this function?
While it is possible to do manually (with thread::scope and split_at_mut, for example), ndarray already has parallel iteration integrated into its library, based on rayon:
https://docs.rs/ndarray/latest/ndarray/parallel
Here is what your code would look like with parallel iterators:
use ndarray::parallel::prelude::*;
use ndarray::prelude::*;
pub fn multithread_outer(a: &Array1<f64>, b: &Array1<f64>) -> Array2<f64> {
let mut result = Array2::<f64>::default((a.len(), b.len()));
result
.axis_iter_mut(Axis(0))
.into_par_iter()
.enumerate()
.for_each(|(row_id, mut row)| {
for (col_id, cell) in row.iter_mut().enumerate() {
*cell = a[row_id] * b[col_id];
}
});
result
}
fn main() {
let a = Array1::from_vec(vec![1., 2., 3.]);
let b = Array1::from_vec(vec![4., 5., 6., 7.]);
let c = multithread_outer(&a, &b);
println!("{}", c)
}
[[4, 5, 6, 7],
[8, 10, 12, 14],
[12, 15, 18, 21]]
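For comparison, the manual route mentioned above (thread::scope plus disjoint mutable chunks) could look roughly like the following. This is a sketch under assumptions: it uses a flat, row-major Vec<f64> instead of an ndarray Array2 so that it depends only on the standard library, it relies on chunks_mut rather than calling split_at_mut by hand, and the function name outer_scoped and the threads parameter are invented for illustration.

```rust
use std::thread;

/// Outer product of `a` and `b`, stored row-major in a flat Vec.
/// `outer_scoped` and `threads` are hypothetical names for this sketch.
fn outer_scoped(a: &[f64], b: &[f64], threads: usize) -> Vec<f64> {
    let cols = b.len();
    let mut result = vec![0.0; a.len() * cols];
    if a.is_empty() || cols == 0 {
        return result;
    }
    // Rows handled by each thread, rounded up so every row is covered.
    let threads = threads.max(1);
    let rows_per_chunk = (a.len() + threads - 1) / threads;

    thread::scope(|s| {
        // chunks_mut hands out non-overlapping &mut slices, one per thread,
        // so no mutex is needed and the borrow checker is satisfied.
        for (chunk_id, chunk) in result.chunks_mut(rows_per_chunk * cols).enumerate() {
            s.spawn(move || {
                let first_row = chunk_id * rows_per_chunk;
                for (r, row) in chunk.chunks_mut(cols).enumerate() {
                    for (c, cell) in row.iter_mut().enumerate() {
                        *cell = a[first_row + r] * b[c];
                    }
                }
            });
        }
        // All spawned threads are joined automatically when the scope ends.
    });
    result
}

fn main() {
    let c = outer_scoped(&[1., 2., 3.], &[4., 5., 6., 7.], 2);
    println!("{:?}", c);
}
```

The rayon-based version is usually the better choice, since it balances work across a thread pool instead of spawning threads on every call.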

Is try_fold the preferred way to stop an infinite iteration or are there more idiomatic alternatives?

I'm looking for ways to interrupt iteration of an infinite iterator. I found that try_fold accomplishes my objective. However, that requires doing the awkward thing of returning an Err in the successful case. What I'd like to understand is whether this is an idiomatic way of doing things. The only other way I can think of doing this is using a regular for, or something like a find while keeping external state (which feels even weirder!). I know in Clojure there's reduced, but I couldn't find an equivalent for Rust.
Here's a minimum viable example. The example cycles around the initial Vec, summing each item as it goes, and stops at the first sum larger than 10. I.e. it returns 12, because 1 + 5 - 3 + 1 + 5 - 3 + 1 + 5 = 12:
fn main() {
let seed = vec![1, 5, -3];
let res = seed.iter().cycle().try_fold(0, |accum, value| {
let next = accum + value;
if next > 10 {
Err(next)
} else {
Ok(next)
}
});
if let Err(res) = res {
println!("{:?}", res);
} else {
unreachable!();
}
}
The part that feels weird to me is the if let Err(res) = res being the positive condition and really the only way out of the cycle (hence why the other branch is unreachable).
Is there a 'better' way?
Simple, just iterate normally
fn main() {
let seed = vec![1, 5, -3];
let mut accum = 0;
for value in seed.iter().cycle() {
accum += value;
if accum > 10 {
break;
}
}
println!("{:?}", accum)
}
You can use scan and find:
fn main() {
let seed = vec![1, 5, -3];
let res = seed
.iter()
.cycle()
.scan(0, |accum, value| {
*accum += value;
Some(*accum)
})
.find(|&x| x > 10)
.unwrap();
println!("{:?}", res);
}
This is actually slightly shorter than using fold_while from itertools, which would look like this:
use itertools::{FoldWhile::{Continue, Done}, Itertools};
fn main() {
let seed = vec![1, 5, -3];
let res = seed
.iter()
.cycle()
.fold_while(0, |accum, value| {
let next = accum + value;
if next > 10 {
Done(next)
} else {
Continue(next)
}
})
.into_inner();
println!("{:?}", res);
}
Consider using itertools::fold_while if you don't mind using a well-established external library. There might be other useful extensions if you like this style of programming.
As of today, there is no obvious answer to your question, which is actually very surprising. As @peter-hall suggested, scan is the idiomatic functional answer to the problem, but Rust's scan is known to be ugly to use.
In my opinion try_fold remains the rustiest option. Here are my combined suggestions to make looped folds more satisfying (pick one or more changes!):
let res = {
use Result::{Err as Break, Ok as Next};
seed.iter()
.cycle()
.try_fold(0, |accum, value| match accum + value {
res if res > 10 => Break(res),
next => Next(next),
})
.unwrap_err()
};
Using a match helps name the cases better than if/else, in my opinion. The aliases for the Result variants also add some semantics describing the intention to break.
If you don't want to use unwrap_err, someone suggested to replace it by
let res = iter.try_fold(...).unwrap_or_else(|res| res);
It's also possible to use an irrefutable pattern:
let (Ok(res) | Err(res)) = iter.try_fold(...);
If it were for my own code, I'd keep things as simple and vanilla as possible:
let res = seed
.iter()
.cycle()
.try_fold(0, |accum, value| match accum + value {
res if res > 10 => Err(res),
acc => Ok(acc),
})
.unwrap_err();
Reading .unwrap_err() is clear enough to me because it means that the Err is the return type we expect. That's my best shot, cheers!
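One more option that none of the answers above mention: try_fold also accepts std::ops::ControlFlow as the closure's return type, which names the two cases Break and Continue without borrowing Result's error semantics. The sketch below assumes a reasonably recent toolchain (ControlFlow has been stable since Rust 1.55); the helper name first_sum_above is made up for illustration.

```rust
use std::ops::ControlFlow;

/// Cycles over `seed`, summing as it goes, and returns the first running sum
/// greater than `limit`. `first_sum_above` is a hypothetical helper name.
/// Caveat: if the running sum never exceeds `limit`, this does not terminate
/// normally (it loops until the addition overflows).
fn first_sum_above(seed: &[i32], limit: i32) -> Option<i32> {
    let flow = seed.iter().cycle().try_fold(0, |accum, &value| {
        let next = accum + value;
        if next > limit {
            ControlFlow::Break(next)
        } else {
            ControlFlow::Continue(next)
        }
    });
    match flow {
        ControlFlow::Break(sum) => Some(sum),
        // Only reachable when `seed` is empty, since cycle() then yields nothing.
        ControlFlow::Continue(_) => None,
    }
}

fn main() {
    println!("{:?}", first_sum_above(&[1, 5, -3], 10)); // Some(12)
}
```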

How can I group consecutive integers in a vector in Rust?

I have a Vec<i64> and I want to know all the groups of integers that are consecutive. As an example:
let v = vec![1, 2, 3, 5, 6, 7, 9, 10];
I'm expecting something like this or similar:
[[1, 2, 3], [5, 6, 7], [9, 10]];
The representation (a vector of vectors, tuples, or something else) really doesn't matter, but I should get several grouped lists of consecutive numbers.
At first glance, it seems like I'll need to use itertools and the group_by function, but I have no idea how...
You can indeed use group_by for this, but you might not really want to. Here's what I would probably write instead:
fn consecutive_slices(data: &[i64]) -> Vec<&[i64]> {
let mut slice_start = 0;
let mut result = Vec::new();
for i in 1..data.len() {
if data[i - 1] + 1 != data[i] {
result.push(&data[slice_start..i]);
slice_start = i;
}
}
if data.len() > 0 {
result.push(&data[slice_start..]);
}
result
}
This is similar in principle to eXodiquas' answer, but instead of accumulating a Vec<Vec<i64>>, I use the indices to accumulate a Vec of slice references that refer to the original data. (This question explains why I made consecutive_slices take &[T].)
It's also possible to do the same thing without allocating a Vec, by returning an iterator; however, I like the above version better. Here's the zero-allocation version I came up with:
fn consecutive_slices(data: &[i64]) -> impl Iterator<Item = &[i64]> {
let mut slice_start = 0;
(1..=data.len()).flat_map(move |i| {
if i == data.len() || data[i - 1] + 1 != data[i] {
let begin = slice_start;
slice_start = i;
Some(&data[begin..i])
} else {
None
}
})
}
It's not as readable as a for loop, but it doesn't need to allocate a Vec for the return value, so this version is more flexible.
Here's a "more functional" version using group_by:
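use itertools::Itertools;
fn consecutive_slices(data: &[i64]) -> Vec<Vec<i64>> {
    // Key each index by (value - index); the key is constant within a consecutive run.
    // The subtraction is done in i64 so values smaller than their index cannot underflow.
    (&(0..data.len()).group_by(|&i| data[i] - i as i64))
        .into_iter()
        .map(|(_, group)| group.map(|i| data[i]).collect())
        .collect()
}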
The idea is to make a key function for group_by that takes the difference between each element and its index in the slice. Consecutive elements will have the same key because indices increase by 1 each time. One reason I don't like this version is that it's quite difficult to get slices of the original data structure; you almost have to create a Vec<Vec<i64>> (hence the two collects). The other reason is that I find it harder to read.
However, when I first wrote my preferred version (the first one, with the for loop), it had a bug (now fixed), while the other two versions were correct from the start. So there may be merit to writing denser code with functional abstractions, even if there is some hit to readability and/or performance.
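On newer toolchains there is also a std-only route that returns slices of the original data directly: the slice method chunk_by, which splits a slice wherever a predicate on two neighbouring elements is false. The sketch below assumes Rust 1.77 or later, where this method is stable; it is an addition here, not part of the answer above.

```rust
/// Splits `data` into maximal runs of consecutive integers.
/// Assumes Rust 1.77+ for `slice::chunk_by`.
fn consecutive_slices(data: &[i64]) -> Vec<&[i64]> {
    // Two neighbours stay in the same chunk when the second is exactly one
    // more than the first; checked_add avoids overflow at i64::MAX.
    data.chunk_by(|a, b| a.checked_add(1) == Some(*b)).collect()
}

fn main() {
    let v = vec![1, 2, 3, 5, 6, 7, 9, 10];
    println!("{:?}", consecutive_slices(&v)); // [[1, 2, 3], [5, 6, 7], [9, 10]]
}
```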
let v = vec![1, 2, 3, 5, 6, 7, 9, 10];
let mut res = Vec::new();
let mut prev = v[0];
let mut sub_v = Vec::new();
sub_v.push(prev);
for i in 1..v.len() {
if v[i] == prev + 1 {
sub_v.push(v[i]);
prev = v[i];
} else {
res.push(sub_v.clone());
sub_v.clear();
sub_v.push(v[i]);
prev = v[i];
}
}
res.push(sub_v);
This should solve your problem.
Iterate over the given vector and check whether the current i64 (in my case i32) is one more than the previous value; if so, push it into a vector (sub_v). When the series breaks, push sub_v into the result vector. Repeat.
But I guess you wanted something functional?
Another possible solution, that uses std only, could be:
fn consecutive_slices(v: &[i64]) -> Vec<Vec<i64>> {
let t: Vec<Vec<i64>> = v
.into_iter()
.chain([*v.last().unwrap_or(&-1)].iter())
.scan(Vec::new(), |s, &e| {
match s.last() {
None => { s.push(e); Some((false, Vec::new())) },
Some(&p) if p == e - 1 => { s.push(e); Some((false, Vec::new()))},
Some(&p) if p != e - 1 => {let o = s.clone(); *s = vec![e]; Some((true, o))},
_ => None,
}
})
.filter_map(|(n, v)| {
match n {
true => Some(v.clone()),
false => None,
}
})
.collect();
t
}
The chain appends a copy of the last element as a sentinel, so that the scan also emits the final group.
I like the answers above but you could also use peekable() to tell if the next value is different.
https://doc.rust-lang.org/stable/std/iter/struct.Peekable.html
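A minimal sketch of what the peekable() idea could look like, assuming owned groups are acceptable as output. The function name consecutive_groups is made up for illustration.

```rust
/// Groups consecutive integers by peeking at the next value.
/// `consecutive_groups` is a hypothetical helper name.
fn consecutive_groups(data: &[i64]) -> Vec<Vec<i64>> {
    let mut groups = Vec::new();
    let mut current = Vec::new();
    let mut iter = data.iter().copied().peekable();
    while let Some(value) = iter.next() {
        current.push(value);
        // The run continues only if the next value exists and equals value + 1.
        let run_continues = iter
            .peek()
            .map_or(false, |&next| value.checked_add(1) == Some(next));
        if !run_continues {
            // Move the finished group out and start a fresh, empty one.
            groups.push(std::mem::take(&mut current));
        }
    }
    groups
}

fn main() {
    let v = vec![1, 2, 3, 5, 6, 7, 9, 10];
    println!("{:?}", consecutive_groups(&v)); // [[1, 2, 3], [5, 6, 7], [9, 10]]
}
```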
I would probably use a fold for this?
That's because I'm very much a functional programmer.
Obviously mutating the accumulator is weird :P but this works too and represents another way of thinking about it.
This is basically a recursive solution and can be modified easily to use immutable datastructures.
https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=43b9e3613c16cb988da58f08724471a4
fn main() {
let v = vec![1, 2, 3, 5, 6, 7, 9, 10];
let mut res: Vec<Vec<i32>> = vec![];
let (last_group, _): (Vec<i32>, Option<i32>) = v
.iter()
.fold((vec![], None), |(mut cur_group, last), x| {
match last {
None => {
cur_group.push(*x);
(cur_group, Some(*x))
}
Some(last) => {
if x - last == 1 {
cur_group.push(*x);
(cur_group, Some(*x))
} else {
res.push(cur_group);
(vec![*x], Some(*x))
}
}
}
});
res.push(last_group);
println!("{:?}", res);
}

Best way to remove elements of Vec depending on other elements of the same Vec

I have a vector of sets and I want to remove all sets that are subsets of other sets in the vector. Example:
a = {0, 3, 5}
b = {0, 5}
c = {0, 2, 3}
In this case I would like to remove b, because it's a subset of a. I'm fine with using a "dumb" n² algorithm.
Sadly, it's pretty tricky to get it working with the borrow checker. The best I've come up with is (Playground):
let mut v: Vec<HashSet<u8>> = vec![];
let mut to_delete = Vec::new();
for (i, set_a) in v.iter().enumerate().rev() {
for set_b in &v[..i] {
if set_a.is_subset(&set_b) {
to_delete.push(i);
break;
}
}
}
for i in to_delete {
v.swap_remove(i);
}
(note: the code above is not correct! See comments for further details)
I see a few disadvantages:
- I need an additional vector with additional allocations
- Maybe there are more efficient ways than calling swap_remove often
- If I need to preserve order, I can't use swap_remove, but have to use remove which is slow
Is there a better way to do this? I'm not just asking about my use case, but about the general case as it's described in the title.
Here is a solution that does not make additional allocations and preserves the order:
fn product_retain<T, F>(v: &mut Vec<T>, mut pred: F)
where F: FnMut(&T, &T) -> bool
{
let mut j = 0;
for i in 0..v.len() {
// invariants:
// items v[0..j] will be kept
// items v[j..i] will be removed
if (0..j).chain(i + 1..v.len()).all(|a| pred(&v[i], &v[a])) {
v.swap(i, j);
j += 1;
}
}
v.truncate(j);
}
fn main() {
// test with a simpler example
// unique elements
let mut v = vec![1, 2, 3];
product_retain(&mut v, |a, b| a != b);
assert_eq!(vec![1, 2, 3], v);
let mut v = vec![1, 3, 2, 4, 5, 1, 2, 4];
product_retain(&mut v, |a, b| a != b);
assert_eq!(vec![3, 5, 1, 2, 4], v);
}
This is a kind of partition algorithm. The elements in the first partition will be kept and in the second partition will be removed.
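Applied to the sets from the question, the predicate would be "a is not a subset of b". The sketch below repeats product_retain from this answer so that it is self-contained; the helper set is made up for illustration.

```rust
use std::collections::HashSet;

// Same function as in the answer, repeated here so the example compiles on its own.
fn product_retain<T, F>(v: &mut Vec<T>, mut pred: F)
where
    F: FnMut(&T, &T) -> bool,
{
    let mut j = 0;
    for i in 0..v.len() {
        // Keep v[i] only if the predicate holds against every kept item
        // and every item that has not been visited yet.
        if (0..j).chain(i + 1..v.len()).all(|a| pred(&v[i], &v[a])) {
            v.swap(i, j);
            j += 1;
        }
    }
    v.truncate(j);
}

// Hypothetical helper to build a set from a slice.
fn set(items: &[u8]) -> HashSet<u8> {
    items.iter().copied().collect()
}

fn main() {
    let mut v = vec![set(&[0, 3, 5]), set(&[0, 5]), set(&[0, 2, 3])];
    // Keep a set only if it is not a subset of any other remaining set.
    product_retain(&mut v, |a, b| !a.is_subset(b));
    assert_eq!(v, vec![set(&[0, 3, 5]), set(&[0, 2, 3])]);
    println!("{:?}", v);
}
```

With this predicate, only one copy of two equal sets survives: the earlier copy is dropped, and the later one is then no longer compared against it.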
You can use a while loop instead of the for:
use std::collections::HashSet;
fn main() {
let arr: &[&[u8]] = &[
&[3],
&[1,2,3],
&[1,3],
&[1,4],
&[2,3]
];
let mut v:Vec<HashSet<u8>> = arr.iter()
.map(|x| x.iter().cloned().collect())
.collect();
let mut pos = 0;
while pos < v.len() {
let is_sub = v[pos+1..].iter().any(|x| v[pos].is_subset(x))
|| v[..pos].iter().any(|x| v[pos].is_subset(x));
if is_sub {
v.swap_remove(pos);
} else {
pos+=1;
}
}
println!("{:?}", v);
}
There are no additional allocations.
To avoid using remove and swap_remove, you can change the type of vector to Vec<Option<HashSet<u8>>>:
use std::collections::HashSet;
fn main() {
    let arr: &[&[u8]] = &[
        &[3],
        &[1, 2, 3],
        &[1, 3],
        &[1, 4],
        &[2, 3],
    ];
    let mut v: Vec<Option<HashSet<u8>>> = arr.iter()
        .map(|x| Some(x.iter().cloned().collect()))
        .collect();
    for pos in 0..v.len() {
        let is_sub = match v[pos].as_ref() {
            Some(chk) =>
                v[..pos].iter().flat_map(|x| x).any(|x| chk.is_subset(x))
                    || v[pos + 1..].iter().flat_map(|x| x).any(|x| chk.is_subset(x)),
            None => false,
        };
        if is_sub {
            v[pos] = None; // Replace with None instead of removing
        }
    }
    println!("{:?}", v); // [None, Some({3, 2, 1}), None, Some({1, 4}), None] (order within each set may vary)
}
"I need an additional vector with additional allocations"
I wouldn't worry about that allocation, since the memory and runtime footprint of that allocation will be really small compared to the rest of your algorithm.
"Maybe there are more efficient ways than calling swap_remove often."
"If I need to preserve order, I can't use swap_remove, but have to use remove which is slow"
I'd change to_delete from Vec<usize> to Vec<bool> and just mark whether a particular hash set should be removed. You can then use Vec::retain, which conditionally removes elements while preserving order. Unfortunately, this function doesn't pass the index to the closure, so we have to create a workaround (playground):
let mut to_delete = vec![false; v.len()];
for (i, set_a) in v.iter().enumerate().rev() {
for set_b in &v[..i] {
if set_a.is_subset(&set_b) {
to_delete[i] = true;
}
}
}
{
// This assumes that retain checks the elements in the order.
let mut i = 0;
v.retain(|_| {
let ret = !to_delete[i];
i += 1;
ret
});
}
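A self-contained sketch of the same flag-then-retain idea follows. It differs from the snippet above in that it compares each set against all other sets rather than only the earlier ones, and it keeps the first of two equal sets; both choices are assumptions made for this illustration. The function name remove_subsets is hypothetical. Vec::retain is documented to visit elements exactly once in their original order, which is what the counter in the snippet above relies on.

```rust
use std::collections::HashSet;

/// Removes every set that is a subset of another set in `v`, preserving order.
/// `remove_subsets` is a hypothetical name for this sketch.
fn remove_subsets(v: &mut Vec<HashSet<u8>>) {
    // keep[i] is false when v[i] is a strict subset of another set,
    // or equal to a set that appears earlier in the vector.
    let keep: Vec<bool> = (0..v.len())
        .map(|i| {
            !v.iter().enumerate().any(|(j, other)| {
                i != j && v[i].is_subset(other) && (v[i].len() < other.len() || j < i)
            })
        })
        .collect();
    // Feed the flags to retain in order, one flag per visited element.
    let mut flags = keep.into_iter();
    v.retain(|_| flags.next().unwrap_or(true));
}

fn main() {
    let mut v: Vec<HashSet<u8>> = [&[3u8][..], &[1, 2, 3], &[1, 3], &[1, 4], &[2, 3]]
        .iter()
        .map(|items| items.iter().copied().collect())
        .collect();
    remove_subsets(&mut v);
    println!("{:?}", v); // two sets remain: {1, 2, 3} and {1, 4}
}
```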
If your hash set has a special value which can never occur under normal conditions, you can use it to mark a set as "to delete", and then check that condition in retain (it would require changing the outer loop from iterator-based to range-based though).
Sidenote (if that HashSet<u8> is not just a toy example): a more efficient way to store and compare sets of small integers would be to use a bitset.
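To make the bitset remark concrete, here is a minimal hand-rolled sketch for u8 elements: 256 possible values fit into four u64 words, and the subset test becomes a few bitwise operations. The type SmallSet and its methods are invented for this illustration; a real project might prefer an existing bitset crate.

```rust
/// A fixed-size bitset covering all 256 possible `u8` values.
/// `SmallSet` is a hypothetical type for this sketch.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
struct SmallSet([u64; 4]);

impl SmallSet {
    fn from_items(items: &[u8]) -> Self {
        let mut set = SmallSet::default();
        for &item in items {
            set.insert(item);
        }
        set
    }

    fn insert(&mut self, item: u8) {
        // Word index is item / 64 (0..=3), bit index is item % 64.
        self.0[usize::from(item / 64)] |= 1u64 << (item % 64);
    }

    /// True if every element of `self` is also in `other`.
    fn is_subset(&self, other: &SmallSet) -> bool {
        // A bit set in `self` but not in `other` would survive `a & !b`.
        self.0.iter().zip(other.0.iter()).all(|(a, b)| (a & !b) == 0)
    }
}

fn main() {
    let a = SmallSet::from_items(&[0, 3, 5]);
    let b = SmallSet::from_items(&[0, 5]);
    let c = SmallSet::from_items(&[0, 2, 3]);
    println!("{} {}", b.is_subset(&a), c.is_subset(&a)); // true false
}
```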

What is the idiomatic way to pop the last N elements in a mutable Vec?

I am contributing Rust code to RosettaCode to both learn Rust and contribute to the Rust community at the same time. What is the best idiomatic way to pop the last n elements in a mutable Vec?
Here's roughly what I have written, but I'd like to see if there's a better way:
fn main() {
let mut nums: Vec<u32> = Vec::new();
nums.push(1);
nums.push(2);
nums.push(3);
nums.push(4);
nums.push(5);
let n = 2;
for _ in 0..n {
nums.pop();
}
for e in nums {
println!("{}", e)
}
}
I'd recommend using Vec::truncate:
fn main() {
let mut nums = vec![1, 2, 3, 4, 5];
let n = 2;
let final_length = nums.len().saturating_sub(n);
nums.truncate(final_length);
println!("{:?}", nums);
}
Additionally, I:
- used saturating_sub to handle the case where there aren't N elements in the vector
- used vec![] to construct the vector of numbers easily
- printed out the entire vector in one go
Normally when you "pop" something, you want to have those values. If you want the values in another vector, you can use Vec::split_off:
let tail = nums.split_off(final_length);
If you want access to the elements but do not want to create a whole new vector, you can use Vec::drain:
for i in nums.drain(final_length..) {
println!("{}", i)
}
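The pieces above could be combined into a small helper, shown here as a sketch. The name pop_last_n is made up; it returns the removed tail in its original order and relies on saturating_sub so that asking for more elements than exist simply empties the vector.

```rust
/// Removes up to `n` elements from the end of `v` and returns them in order.
/// `pop_last_n` is a hypothetical helper name.
fn pop_last_n<T>(v: &mut Vec<T>, n: usize) -> Vec<T> {
    // saturating_sub clamps at 0, so n > v.len() takes everything.
    let keep = v.len().saturating_sub(n);
    v.split_off(keep)
}

fn main() {
    let mut nums = vec![1, 2, 3, 4, 5];
    let tail = pop_last_n(&mut nums, 2);
    println!("{:?} {:?}", nums, tail); // [1, 2, 3] [4, 5]
}
```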
An alternate approach would be to use Vec::drain instead. This gives you an iterator so you can actually use the elements that are removed.
fn main() {
let mut nums: Vec<u32> = Vec::new();
nums.push(1);
nums.push(2);
nums.push(3);
nums.push(4);
nums.push(5);
let n = 2;
let new_len = nums.len() - n;
for removed_element in nums.drain(new_len..) {
println!("removed: {}", removed_element);
}
for retained_element in nums {
println!("retained: {}", retained_element);
}
}
The drain method accepts a range (any type implementing RangeBounds, which older Rust versions called RangeArgument) in the form <start-inclusive>..<end-exclusive>. Both start and end may be omitted to default to the beginning/end of the vector. So above, we're really just saying: start at new_len and drain to the end.
You should take a look at the Vec::truncate function from the standard library, that can do this for you.
fn main() {
let mut nums: Vec<u32> = Vec::new();
nums.push(1);
nums.push(2);
nums.push(3);
nums.push(4);
nums.push(5);
let n = 2;
let new_len = nums.len() - n;
nums.truncate(new_len);
for e in nums {
println!("{}", e)
}
}
