Use mutable iterator twice - rust

I'm trying to write a general function that takes an iterable (or iterator) and iterates it twice, at least once mutably, like:
fn f<'a, I>(iter: I)
where
    I: Iterator<Item = &'a mut i32> + Clone,
{
    for i in iter.clone() {
        println!("{}", *i);
    }
    for i in iter.clone() {
        *i += 1;
    }
}
But it doesn't work because mutable iterators tend not to implement Clone, and for good reason. My real-world example is iteration over HashMap values, where std::collections::hash_map::ValuesMut is not Clone. Is there any way to do this?

Unfortunately you are unable to do this. You will either need to merge them into a single for loop or save the items from the iterator to iterate over them again later.
The closest thing I could come up with is to use IntoIterator to require that the argument can be used to make a new iterator multiple times.
pub fn foo<'a, T>(iter: &'a mut T)
where for<'b> &'b mut T: IntoIterator<Item=&'a mut i32> {
for i in iter.into_iter() {
println!("{}", *i);
}
for i in iter.into_iter() {
*i += 1;
}
}
let mut map = HashMap::new();
map.insert(2, 5);
map.insert(6, 1);
map.insert(3, 4);
foo(&mut map.values_mut())
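Be aware that with an argument such as &mut map.values_mut(), both into_iter() calls draw from the same underlying iterator, so the first loop exhausts it and the second loop sees no items. It is much less of a headache to just pass a reference to the entire map: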
pub fn bar<T>(map: &mut HashMap<T, i32>) {
for i in map.values() {
println!("{}", *i);
}
for i in map.values_mut() {
*i += 1;
}
}
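The other option mentioned above, saving the items so they can be iterated again, can be sketched roughly as follows. This is only a minimal sketch: the names print_then_increment and bumped_values are hypothetical, and it assumes that buffering one Vec of references is acceptable for your use case.

```rust
use std::collections::HashMap;

// Hypothetical helper: consume the mutable iterator once, buffer the
// references, then walk the buffer as many times as needed.
fn print_then_increment<'a, I>(iter: I)
where
    I: Iterator<Item = &'a mut i32>,
{
    let mut items: Vec<&'a mut i32> = iter.collect();

    // First pass: read-only access through the buffered references.
    for i in items.iter() {
        println!("{}", **i);
    }
    // Second pass: mutate through the same references.
    for i in items.iter_mut() {
        **i += 1;
    }
}

// Hypothetical usage: bump every value of a map and return the values sorted.
fn bumped_values(mut map: HashMap<i32, i32>) -> Vec<i32> {
    print_then_increment(map.values_mut());
    let mut values: Vec<i32> = map.values().copied().collect();
    values.sort();
    values
}
```

The function no longer needs a Clone bound, at the cost of one allocation for the buffer.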

Related

Lockless processing of non overlapping non contiguous indexes by multiple threads in Rust

I am practicing Rust and decided to create a matrix ops/factorization project.
Basically, I want to process the underlying vector in multiple threads. Since each thread will be given non-overlapping indexes (which may or may not be contiguous), and the threads will be joined before the end of whatever function created them, there is no need for a lock or other synchronization.
I know that there are several crates that can do this, but I would like to know if there is a relatively idiomatic crate-free way to implement it on my own.
The best I could come up with is (simplified the code a bit):
use std::thread;
//This represents the Matrix
#[derive(Debug, Clone)]
pub struct MainStruct {
pub data: Vec<f64>,
}
//This is the bit that will be shared by the threads,
//ideally it should have its lifetime tied to that of MainStruct
//but i have no idea how to make phantomdata work in this case
#[derive(Debug, Clone)]
pub struct SliceTest {
pub data: Vec<SubSlice>,
}
//This struct is to hide *mut f64 to allow it to be shared to other threads
#[derive(Debug, Clone)]
pub struct SubSlice {
pub data: *mut f64,
}
impl MainStruct {
pub fn slice(&mut self) -> (SliceTest, SliceTest) {
let mut out_vec_odd: Vec<SubSlice> = Vec::new();
let mut out_vec_even: Vec<SubSlice> = Vec::new();
unsafe {
let ptr = self.data.as_mut_ptr();
for i in 0..self.data.len() {
let ptr_to_push = ptr.add(i);
//Non contiguous idxs
if i % 2 == 0 {
out_vec_even.push(SubSlice{data:ptr_to_push});
} else {
out_vec_odd.push(SubSlice{data:ptr_to_push});
}
}
}
(SliceTest{data: out_vec_even}, SliceTest{data: out_vec_odd})
}
}
impl SubSlice {
pub fn set(&self, val: f64) {
unsafe {*(self.data) = val;}
}
}
unsafe impl Send for SliceTest {}
unsafe impl Send for SubSlice {}
fn main() {
let mut maindata = MainStruct {
data: vec![0.0, 1.0, 2.0, 3.0, 4.0, 5.0],
};
let (mut outvec1, mut outvec2) = maindata.slice();
let mut threads = Vec::new();
threads.push(
thread::spawn(move || {
for i in 0..outvec1.data.len() {
outvec1.data[i].set(999.9);
}
})
);
threads.push(
thread::spawn(move || {
for i in 0..outvec2.data.len() {
outvec2.data[i].set(999.9);
}
})
);
for handles in threads {
handles.join();
}
println!("maindata = {:?}", maindata.data);
}
EDIT:
Following kmdreko's suggestion below, I got the code to work exactly how I wanted without using unsafe code, yay!
Of course, in terms of performance it may be cheaper to copy the f64 slices than to build vectors of mutable references, unless your struct holds other structs rather than f64 values.
extern crate crossbeam;
use crossbeam::thread;
#[derive(Debug, Clone)]
pub struct Matrix {
data: Vec<f64>,
m: usize, //number of rows
n: usize, //number of cols
}
...
impl Matrix {
...
pub fn get_data_mut(&mut self) -> &mut Vec<f64> {
&mut self.data
}
pub fn calculate_idx(max_cols: usize, i: usize, j: usize) -> usize {
let actual_idx = j + max_cols * i;
actual_idx
}
//Get individual mutable references for contiguous indexes (rows)
pub fn get_all_row_slices(&mut self) -> Vec<Vec<&mut f64>> {
let max_cols = self.max_cols();
let max_rows = self.max_rows();
let inner_data = self.get_data_mut().chunks_mut(max_cols);
let mut out_vec: Vec<Vec<&mut f64>> = Vec::with_capacity(max_rows);
for chunk in inner_data {
let row_vec = chunk.iter_mut().collect();
out_vec.push(row_vec);
}
out_vec
}
//Get mutable references for disjoint indexes (columns)
pub fn get_all_col_slices(&mut self) -> Vec<Vec<&mut f64>> {
let max_cols = self.max_cols();
let max_rows = self.max_rows();
let inner_data = self.get_data_mut().chunks_mut(max_cols);
let mut out_vec: Vec<Vec<&mut f64>> = Vec::with_capacity(max_cols);
for _ in 0..max_cols {
out_vec.push(Vec::with_capacity(max_rows));
}
let mut inner_idx = 0;
for chunk in inner_data {
let row_vec_it = chunk.iter_mut();
for elem in row_vec_it {
out_vec[inner_idx].push(elem);
inner_idx += 1;
}
inner_idx = 0;
}
out_vec
}
...
}
fn test_multithreading() {
fn test(in_vec: Vec<&mut f64>) {
for elem in in_vec {
*elem = 33.3;
}
}
fn launch_task(mat: &mut Matrix, f: fn(Vec<&mut f64>)) {
let test_vec = mat.get_all_row_slices();
thread::scope(|s| {
for elem in test_vec.into_iter() {
s.spawn(move |_| {
println!("Spawning thread...");
f(elem);
});
}
}).unwrap();
}
let rows = 4;
let cols = 3;
//new function code omitted, returns Result<Self, MatrixError>
let mut mat = Matrix::new(rows, cols).unwrap();
launch_task(&mut mat, test);
for i in 0..rows {
for j in 0..cols {
//Requires index trait implemented for matrix
assert_eq!(mat[(i, j)], 33.3);
}
}
}
This API is unsound. Since there is no lifetime annotation binding SliceTest and SubSlice to the MainStruct, they can be preserved after the data has been destroyed and if used would result in use-after-free errors.
It's easy to make it safe though; you can use .iter_mut() to get distinct mutable references to your elements:
pub fn slice(&mut self) -> (Vec<&mut f64>, Vec<&mut f64>) {
let mut out_vec_even = vec![];
let mut out_vec_odd = vec![];
for (i, item_ref) in self.data.iter_mut().enumerate() {
if i % 2 == 0 {
out_vec_even.push(item_ref);
} else {
out_vec_odd.push(item_ref);
}
}
(out_vec_even, out_vec_odd)
}
However, this surfaces another problem: thread::spawn cannot hold references to local variables. Spawned threads are allowed to outlive the scope they're created in, so even though you did .join() them, nothing requires you to. This was a potential issue in your original code as well; the compiler just couldn't warn you about it.
There's no easy way to solve this. You'd need to use a non-referential way to use data on the other threads, but that would be using Arc, which doesn't allow mutating its data, so you'd have to resort to a Mutex, which is what you've tried to avoid.
I would suggest reaching for scope from the crossbeam crate, which does allow you to spawn threads that reference local data. I know you've wanted to avoid using crates, but this is the best solution in my opinion.
See a working version on the playground.
See:
How to get multiple mutable references to elements in a Vec?
Can you specify a non-static lifetime for threads?
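For readers on newer toolchains: since Rust 1.63 the standard library provides std::thread::scope, which offers the same kind of scoped spawning without an external crate. The following is a minimal sketch of the even/odd split combined with it; split_even_odd and fill_in_parallel are hypothetical names chosen for illustration, not part of the original code.

```rust
use std::thread;

// Split a slice into even- and odd-indexed mutable references.
// iter_mut() guarantees the references are distinct, so no unsafe is needed.
fn split_even_odd(data: &mut [f64]) -> (Vec<&mut f64>, Vec<&mut f64>) {
    let mut even = Vec::new();
    let mut odd = Vec::new();
    for (i, item) in data.iter_mut().enumerate() {
        if i % 2 == 0 {
            even.push(item);
        } else {
            odd.push(item);
        }
    }
    (even, odd)
}

// Hypothetical driver: write one value to the even indexes and another to
// the odd indexes, each on its own thread.
fn fill_in_parallel(data: &mut [f64], even_val: f64, odd_val: f64) {
    let (even, odd) = split_even_odd(data);
    thread::scope(|s| {
        s.spawn(move || {
            for x in even {
                *x = even_val;
            }
        });
        s.spawn(move || {
            for x in odd {
                *x = odd_val;
            }
        });
    }); // every thread spawned in the scope is joined before scope() returns
}
```

Because the scope joins its threads before returning, the borrows of data are known to end there, which is what lets the compiler accept references to local data.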

Optionally call `skip` in a custom iterator `next()` function

I have a custom iterator and I would like to optionally call .skip(...) in the custom .next() method. However, I get a type error because Skip != Iterator.
Sample code is as follows:
struct CrossingIter<'a, T> {
index: usize,
iter: std::slice::Iter<'a, T>,
}
impl<'a, T: Float> Iterator for CrossingIter<'a, T> {
type Item = (usize, T);
fn next(&mut self) -> Option<(usize, T)> {
let iter = (&mut self.iter).enumerate();
let iter = if self.index == 0 {
self.index += 3;
iter.skip(3)
} else {
iter
};
// lots of code here working with the new iterator
iter.next()
}
}
The issue is that after calling .skip(3), the type of iter has changed. One solution would be to duplicate the // lots of code ... in each branch of the if statement, but I'd rather not.
My question is: Is there a way to conditionally apply skip(...) to an iterator and continue working with it without duplicating a bunch of code?
skip is designed to construct a new iterator, which is very useful in situations where you want your code to remain, at least on the surface, immutable. However, in your case, you want to advance the existing iterator while still leaving it valid.
There is advance_by, which does what you want, but it's nightly-only, so it won't compile on stable Rust.
if self.index == 0 {
self.index += 3;
self.iter.advance_by(3);
}
We can abuse nth to get what we want, but it's not very idiomatic.
if self.index == 0 {
self.index += 3;
self.iter.nth(2);
}
If I saw that code in production, I'd be quite puzzled.
The simplest, if not terribly satisfying, answer is to just reimplement advance_by as a helper function. The source is available and pretty easy to adapt:
fn my_advance_by(iter: &mut impl Iterator, n: usize) -> Result<(), usize> {
for i in 0..n {
iter.next().ok_or(i)?;
}
Ok(())
}
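To show how the helper might slot into the iterator from the question, here is a minimal sketch. It makes some assumptions that are not in the original: the element type is fixed to f64 instead of the generic Float bound, index is tracked by hand instead of via enumerate, and collect_crossings is a hypothetical convenience function.

```rust
// Advance `iter` by up to `n` items; Err(k) reports how many items were
// consumed before the iterator ran out.
fn my_advance_by(iter: &mut impl Iterator, n: usize) -> Result<(), usize> {
    for i in 0..n {
        iter.next().ok_or(i)?;
    }
    Ok(())
}

struct CrossingIter<'a> {
    index: usize,
    iter: std::slice::Iter<'a, f64>,
}

impl<'a> Iterator for CrossingIter<'a> {
    type Item = (usize, f64);

    fn next(&mut self) -> Option<(usize, f64)> {
        if self.index == 0 {
            // Skip the first three elements in place; the iterator keeps its
            // type. If the slice is too short, next() below returns None.
            let _ = my_advance_by(&mut self.iter, 3);
            self.index += 3;
        }
        let value = *self.iter.next()?;
        let i = self.index;
        self.index += 1;
        Some((i, value))
    }
}

// Hypothetical convenience function for trying the iterator out.
fn collect_crossings(data: &[f64]) -> Vec<(usize, f64)> {
    CrossingIter { index: 0, iter: data.iter() }.collect()
}
```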
All this being said, if your use case is actually just to skip the first three elements, all you need is to start with the skip call and assume your iterator is always Skip
struct CrossingIter<'a, T> {
index: usize,
iter: std::iter::Skip<std::slice::Iter<'a, T>>,
}
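A minimal sketch of how the Skip-typed struct could be constructed and used follows. The constructor new and the T: Copy bound are assumptions made for this illustration; the original uses a Float bound from the num crate.

```rust
use std::iter::Skip;
use std::slice::Iter;

struct CrossingIter<'a, T> {
    index: usize,
    iter: Skip<Iter<'a, T>>,
}

impl<'a, T> CrossingIter<'a, T> {
    // Hypothetical constructor: the skip is applied exactly once, up front,
    // so next() never has to decide whether to skip.
    fn new(data: &'a [T]) -> Self {
        CrossingIter {
            index: 3,
            iter: data.iter().skip(3),
        }
    }
}

impl<'a, T: Copy> Iterator for CrossingIter<'a, T> {
    type Item = (usize, T);

    fn next(&mut self) -> Option<(usize, T)> {
        let value = *self.iter.next()?;
        let i = self.index;
        self.index += 1;
        Some((i, value))
    }
}
```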
I think @Silvio's answer offers a better perspective.
Alternatively, you can call skip(0) instead of returning iter itself in the else branch, so that both branches have the same type (Skip).
Also, the item type produced by enumerate doesn't match your declared signature, fn next(&mut self) -> Option<(usize, T)>. You need to map it.
Here is a working example:
use num::Float;
struct CrossingIter<'a, T> {
index: usize,
iter: std::slice::Iter<'a, T>,
}
impl<'a, T: Float> Iterator for CrossingIter<'a, T> {
type Item = (usize, T);
fn next(&mut self) -> Option<(usize, T)> {
let iter = (&mut self.iter).enumerate();
let mut iter = if self.index == 0 {
self.index += 3;
iter.skip(3)
} else {
iter.skip(0)
};
// lots of code here working with the new iterator
iter.next().map(|(i, &v)| (i, v))
}
}

Mutable borrow into two parts with cleanup

I have some object that I want to split into two parts via a mutable borrow, then combine those back together into the original object when the split references go out of scope.
The simplified example below is for a Count struct that holds a single i32, which we want to split into two &mut i32s that are both folded back into the original Count when the two mutable references go out of scope.
The approach I am taking below is to use an intermediate object CountSplit which holds a mutable reference to the original Count object and has the Drop trait implemented to do the re-combination logic.
This approach feels kludgy. In particular, this is awkward:
let mut ms = c.make_split();
let (x, y) = ms.split();
Doing this in one line like let (x, y) = c.make_split().split(); is not allowed because the intermediate object must have a longer lifetime. Ideally I would be able to do something like let (x, y) = c.magic_split(); and avoid exposing the intermediate object altogether.
Is there a way to do this which doesn't require two lets every time, or some other way to tackle this pattern that would be more idiomatic?
#[derive(Debug)]
struct Count {
val: i32,
}
trait MakeSplit<'a> {
type S: Split<'a>;
fn make_split(&'a mut self) -> Self::S;
}
impl<'a> MakeSplit<'a> for Count {
type S = CountSplit<'a>;
fn make_split(&mut self) -> CountSplit {
CountSplit {
top: self,
second: 0,
}
}
}
struct CountSplit<'a> {
top: &'a mut Count,
second: i32,
}
trait Split<'a> {
fn split(&'a mut self) -> (&'a mut i32, &'a mut i32);
}
impl<'a, 'b> Split<'a> for CountSplit<'b> {
fn split(&mut self) -> (&mut i32, &mut i32) {
(&mut self.top.val, &mut self.second)
}
}
impl<'a> Drop for CountSplit<'a> {
fn drop(&mut self) {
println!("custom drop occurs here");
self.top.val += self.second;
}
}
fn main() {
let mut c = Count { val: 2 };
println!("{:?}", c); // Count { val: 2 }
{
let mut ms = c.make_split();
let (x, y) = ms.split();
println!("split: {} {}", x, y); // split: 2 0
// each of these lines correctly gives a compile-time error
// c.make_split(); // can't borrow c as mutable
// println!("{:?}", c); // or immutable
// ms.split(); // also can't borrow ms
*x += 100;
*y += 5000;
println!("split: {} {}", x, y); // split: 102 5000
} // custom drop occurs here
println!("{:?}", c); // Count { val: 5102 }
}
I don't think a reference to a temporary value like yours can be made to work in today's Rust.
If it's any help, if you specifically want to call a function with two &mut i32 parameters like you mentioned in the comments, e.g.
fn foo(a: &mut i32, b: &mut i32) {
*a += 1;
*b += 2;
println!("split: {} {}", a, b);
}
you can already do that with the same number of lines as you'd have if your chaining worked.
With the chaining, you'd call
let (x, y) = c.make_split().split();
foo(x, y);
And if you just leave out the conversion to a tuple, it looks like this:
let mut ms = c.make_split();
foo(&mut ms.top.val, &mut ms.second);
You can make it a little prettier by e.g. storing the mutable reference to val directly in CountSplit as first, so that it becomes foo(&mut ms.first, &mut ms.second);. If you want it to feel even more like a tuple, I think you can use DerefMut to be able to write foo(&mut ms.0, &mut ms.1);.
Alternatively, you can of course formulate this as a function taking a function
impl Count {
fn as_split<F: FnMut(&mut i32, &mut i32)>(&mut self, mut f: F) {
let mut second = 0;
f(&mut self.val, &mut second);
self.val += second;
}
}
and then just call
c.as_split(foo);
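The variant mentioned above, where CountSplit stores the mutable reference to val directly as first, might look like the following minimal sketch. The inherent make_split method (instead of the MakeSplit trait) and the run_split driver are simplifications assumed for this illustration.

```rust
struct Count {
    val: i32,
}

// Variant of CountSplit that borrows `val` directly as `first`.
struct CountSplit<'a> {
    first: &'a mut i32,
    second: i32,
}

impl Count {
    fn make_split(&mut self) -> CountSplit<'_> {
        CountSplit {
            first: &mut self.val,
            second: 0,
        }
    }
}

impl<'a> Drop for CountSplit<'a> {
    fn drop(&mut self) {
        // Fold the second part back into the original value.
        *self.first += self.second;
    }
}

fn foo(a: &mut i32, b: &mut i32) {
    *a += 1;
    *b += 2;
}

// Hypothetical driver: split, call foo on both halves, recombine on drop.
fn run_split(start: i32) -> i32 {
    let mut c = Count { val: start };
    {
        let mut ms = c.make_split();
        foo(&mut *ms.first, &mut ms.second);
    } // `ms` is dropped here, which recombines the two parts
    c.val
}
```

This still needs the intermediate binding ms, but the call site is a single line and no tuple conversion is involved.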

borrowed value does not live long enough in this case ( Vec<&Fn(i32) -> i32> )

I am getting the error below. I have run into similar errors before and managed to solve them in various ways, but I don't know how to solve it in this case:
borrowed value does not live long enough in
I reduced the failing code to a simpler example, but I still can't find the error:
fn main(){
let mut v: Vec<&Fn(i32) -> i32> = Vec::new();
v.push(&ops_code1);
//v.push(&ops_code2);
//v.push(&ops_code3);
}
fn ops_code1(value: i32) -> i32 {
..//
error: borrowed value does not live long enough
v.push(&ops_code1);
What you are doing here is creating a Vec of closures. In Rust static functions are treated slightly differently from closures, so when we create the reference a closure is actually created. If we do that after creating the Vec the resulting closure gets a shorter lifetime than the Vec, which is an error. We can instead use a let to create the closure before the Vec, giving a long enough lifetime, outliving the Vec:
fn main() {
let extended = &ops_code1;
let mut v: Vec<&dyn Fn(i32) -> i32> = Vec::new();
// Note that placing it here does not work:
// let extended = &ops_code1;
v.push(extended);
//v.push(&ops_code2);
//v.push(&ops_code3);
}
fn ops_code1(value: i32) -> i32 {
println!("ops_code1 {}", value);
value
}
However, if you only use static functions - and not closures - the following also works fine, and lets you avoid the extra let:
fn main() {
let mut v: Vec<fn(i32) -> i32> = Vec::new();
v.push(ops_code1);
v.push(ops_code2);
}
fn ops_code1(value: i32) -> i32 {
println!("ops_code1 {}", value);
value
}
fn ops_code2(value: i32) -> i32 {
println!("ops_code2 {}", value);
value
}
A third option is to use boxed closures, which lets you use both closures and static functions without the extra lets, but with its own trade-offs:
fn main() {
let mut v: Vec<Box<dyn Fn(i32) -> i32>> = Vec::new();
v.push(Box::new(ops_code1));
v.push(Box::new(ops_code2));
for f in v {
f(1);
}
}
fn ops_code1(value: i32) -> i32 {
println!("ops_code1 {}", value);
value
}
fn ops_code2(value: i32) -> i32 {
println!("ops_code2 {}", value);
value
}
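On current editions trait objects must be written with the dyn keyword (bare Fn(i32) -> i32 as a type is an error in the 2021 edition). The following minimal sketch shows the boxed approach holding both a plain function and a capturing closure in the same Vec; apply_all and the simplified ops_code1 body are hypothetical and only serve the illustration.

```rust
fn ops_code1(value: i32) -> i32 {
    value + 1
}

// Hypothetical helper: store a plain function and a capturing closure in
// one Vec, then apply each of them to `input` in order.
fn apply_all(offset: i32, input: i32) -> Vec<i32> {
    let mut v: Vec<Box<dyn Fn(i32) -> i32>> = Vec::new();
    v.push(Box::new(ops_code1));
    // The closure captures `offset` by value, so the Box owns all its data.
    v.push(Box::new(move |x: i32| x * offset));
    v.iter().map(|f| f(input)).collect()
}
```

Because each Box owns its callable, there is no borrow whose lifetime has to outlive the Vec.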

How do I iterate over a range with a custom step?

How can I iterate over a range in Rust with a step other than 1? I'm coming from a C++ background so I'd like to do something like
for(auto i = 0; i <= n; i+=2) {
//...
}
In Rust I need to use the range function, and it doesn't seem like there is a third argument available for having a custom step. How can I accomplish this?
range_step_inclusive and range_step are long gone.
As of Rust 1.28, Iterator::step_by is stable:
fn main() {
for x in (1..10).step_by(2) {
println!("{}", x);
}
}
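To mirror the C++ loop from the question, which includes n itself, step_by can be combined with an inclusive range. A minimal sketch, where evens_up_to and countdown_by are hypothetical helper names:

```rust
// Equivalent of `for (auto i = 0; i <= n; i += 2)`: the inclusive range
// 0..=n makes sure n itself is visited when it is even.
fn evens_up_to(n: u32) -> Vec<u32> {
    (0..=n).step_by(2).collect()
}

// Stepping downwards: reverse the range first, then step.
// Note that step_by panics if `step` is 0.
fn countdown_by(n: u32, step: usize) -> Vec<u32> {
    (0..=n).rev().step_by(step).collect()
}
```

In a loop this reads as for i in (0..=n).step_by(2) { ... }.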
It seems to me that until the .step_by method is made stable, one can easily accomplish what you want with an Iterator (which is what Ranges really are anyway):
struct SimpleStepRange(isize, isize, isize); // start, end, and step
impl Iterator for SimpleStepRange {
type Item = isize;
#[inline]
fn next(&mut self) -> Option<isize> {
if self.0 < self.1 {
let v = self.0;
self.0 = v + self.2;
Some(v)
} else {
None
}
}
}
fn main() {
for i in SimpleStepRange(0, 10, 2) {
println!("{}", i);
}
}
If one needs to iterate multiple ranges of different types, the code can be made generic as follows:
use std::ops::Add;
struct StepRange<T>(T, T, T)
where for<'a> &'a T: Add<&'a T, Output = T>,
T: PartialOrd,
T: Clone;
impl<T> Iterator for StepRange<T>
where for<'a> &'a T: Add<&'a T, Output = T>,
T: PartialOrd,
T: Clone
{
type Item = T;
#[inline]
fn next(&mut self) -> Option<T> {
if self.0 < self.1 {
let v = self.0.clone();
self.0 = &v + &self.2;
Some(v)
} else {
None
}
}
}
fn main() {
for i in StepRange(0u64, 10u64, 2u64) {
println!("{}", i);
}
}
I'll leave it to you to eliminate the upper bounds check to create an open ended structure if an infinite loop is required...
The advantages of this approach are that it works with for sugaring and will continue to work even when unstable features become usable; also, unlike the de-sugared approach using the standard Ranges, it doesn't lose efficiency through multiple .next() calls. The disadvantage is that it takes a few lines of code to set up the iterator, so it may only be worth it for code that has a lot of loops.
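Now that step_by is stable, the open-ended case can also be written with a half-open range from the standard library rather than a custom struct. A minimal sketch, with first_steps and steps_below as hypothetical helper names:

```rust
// Open-ended stepping: `start..` has no upper bound, so the caller decides
// when to stop, here by taking a fixed number of items.
// Note that step_by panics if `step` is 0.
fn first_steps(start: u64, step: usize, count: usize) -> Vec<u64> {
    (start..).step_by(step).take(count).collect()
}

// The same open-ended range, stopped by a condition instead of a count.
fn steps_below(start: u64, step: usize, limit: u64) -> Vec<u64> {
    (start..)
        .step_by(step)
        .take_while(|&x| x < limit)
        .collect()
}
```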
If you are stepping by something predefined, and small like 2, you may wish to use the iterator to step manually. e.g.:
let mut iter = 1..10;
loop {
match iter.next() {
Some(x) => {
println!("{}", x);
},
None => break,
}
iter.next();
}
You could even use this to step by an arbitrary amount (although this is definitely getting longer and harder to digest):
let mut iter = 1..10;
let step = 4;
loop {
match iter.next() {
Some(x) => {
println!("{}", x);
},
None => break,
}
for _ in 0..step-1 {
iter.next();
}
}
Use the num crate with range_step
You'd write your C++ code:
for (auto i = 0; i <= n; i += 2) {
//...
}
...in Rust like so:
let mut i = 0;
while i <= n {
// ...
i += 2;
}
I think the Rust version is more readable too.
