How do I iterate over a range with a custom step?

How can I iterate over a range in Rust with a step other than 1? I'm coming from a C++ background so I'd like to do something like
for (auto i = 0; i <= n; i += 2) {
    // ...
}
In Rust I need to use the range function, and it doesn't seem like there is a third argument available for having a custom step. How can I accomplish this?

range_step_inclusive and range_step are long gone.
As of Rust 1.28, Iterator::step_by is stable:
fn main() {
    for x in (1..10).step_by(2) {
        println!("{}", x);
    }
}

It seems to me that until the .step_by method is made stable, one can easily accomplish what you want with an Iterator (which is what Ranges really are anyway):
struct SimpleStepRange(isize, isize, isize); // start, end, and step

impl Iterator for SimpleStepRange {
    type Item = isize;

    #[inline]
    fn next(&mut self) -> Option<isize> {
        if self.0 < self.1 {
            let v = self.0;
            self.0 = v + self.2;
            Some(v)
        } else {
            None
        }
    }
}

fn main() {
    for i in SimpleStepRange(0, 10, 2) {
        println!("{}", i);
    }
}
If one needs to iterate multiple ranges of different types, the code can be made generic as follows:
use std::ops::Add;

struct StepRange<T>(T, T, T)
where
    for<'a> &'a T: Add<&'a T, Output = T>,
    T: PartialOrd,
    T: Clone;

impl<T> Iterator for StepRange<T>
where
    for<'a> &'a T: Add<&'a T, Output = T>,
    T: PartialOrd,
    T: Clone,
{
    type Item = T;

    #[inline]
    fn next(&mut self) -> Option<T> {
        if self.0 < self.1 {
            let v = self.0.clone();
            self.0 = &v + &self.2;
            Some(v)
        } else {
            None
        }
    }
}

fn main() {
    for i in StepRange(0u64, 10u64, 2u64) {
        println!("{}", i);
    }
}
I'll leave it to you to eliminate the upper-bounds check to create an open-ended structure if an infinite loop is required...
Advantages of this approach are that it works with for sugaring and will continue to work even when unstable features become usable; also, unlike the de-sugared approach using the standard Ranges, it doesn't lose efficiency through multiple .next() calls. Disadvantages are that it takes a few lines of code to set up the iterator, so it may only be worth it for code with a lot of such loops.

If you are stepping by something predefined and small, like 2, you may wish to step the iterator manually. For example:
let mut iter = 1..10;
loop {
    match iter.next() {
        Some(x) => {
            println!("{}", x);
        },
        None => break,
    }
    iter.next();
}
You could even use this to step by an arbitrary amount (although this is definitely getting longer and harder to digest):
let mut iter = 1..10;
let step = 4;
loop {
    match iter.next() {
        Some(x) => {
            println!("{}", x);
        },
        None => break,
    }
    for _ in 0..step - 1 {
        iter.next();
    }
}

Use the num crate with range_step
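A minimal sketch of that, assuming the num crate (which re-exports range_step from num-iter) is declared as a dependency in Cargo.toml:
use num::range_step;

fn main() {
    // Yields 0, 2, 4, 6, 8; the end bound is exclusive.
    for i in range_step(0, 10, 2) {
        println!("{}", i);
    }
}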

You'd write your C++ code:
for (auto i = 0; i <= n; i += 2) {
    // ...
}
...in Rust like so:
let mut i = 0;
while i <= n {
    // ...
    i += 2;
}
I think the Rust version is more readable too.
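For completeness, on modern Rust the same inclusive loop can also be written with step_by over an inclusive range (a sketch, assuming Rust 1.28 or later, where both are stable):
fn main() {
    let n = 10;
    // Equivalent of the C++ `for (auto i = 0; i <= n; i += 2)`: note the inclusive ..= range.
    for i in (0..=n).step_by(2) {
        println!("{}", i);
    }
}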

Related

Iterator that skips every nth element

Rather than taking every Nth element from an iterator, which I can do with Iterator::step_by, I would like to skip every Nth element. How can I achieve this idiomatically? Is there maybe even a standard library or itertools function?
This is what I came up with to skip every 7th element, say. It requires enumerate, filter, and map, though one could use a filter_map instead of the latter two.
(0..100).enumerate()
    .filter(|&(i, x)| (i + 1) % 7 != 0)
    .map(|(i, x)| x);
How could I cast this into a function so that I could simply write:
(0..100).skip_every(7)
If you want to get the exact interface you asked for, your best option at this time is to implement a custom iterator adapter type. Here's a basic version of such a type:
pub struct SkipEvery<I> {
    inner: I,
    every: usize,
    index: usize,
}

impl<I> SkipEvery<I> {
    fn new(inner: I, every: usize) -> Self {
        assert!(every > 1);
        let index = 0;
        Self {
            inner,
            every,
            index,
        }
    }
}

impl<I: Iterator> Iterator for SkipEvery<I> {
    type Item = I::Item;

    fn next(&mut self) -> Option<Self::Item> {
        if self.index == self.every - 1 {
            self.index = 1;
            self.inner.nth(1)
        } else {
            self.index += 1;
            self.inner.next()
        }
    }
}

pub trait IteratorSkipEveryExt: Iterator + Sized {
    fn skip_every(self, every: usize) -> SkipEvery<Self> {
        SkipEvery::new(self, every)
    }
}

impl<I: Iterator + Sized> IteratorSkipEveryExt for I {}
(Playground)
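A short usage sketch of the adapter above (assuming the definitions are in scope):
fn main() {
    // Every 7th element (6, 13, ...) is dropped.
    let v: Vec<u32> = (0..20).skip_every(7).collect();
    assert_eq!(v, vec![0, 1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 18, 19]);
}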
A more complete implementation could also add optimized versions of further Iterator methods, as well as implementations of DoubleEndedIterator and ExactSizeIterator -- see the implementation of StepBy as an example.
Your code is pretty easy to turn into a function:
fn skip_every<I: Iterator>(iter: I, n: usize) -> impl Iterator<Item = <I as Iterator>::Item> {
    iter.enumerate()
        .filter_map(move |(i, v)| if (i + 1) % n != 0 { Some(v) } else { None })
}

fn main() {
    println!("{:?}", skip_every(0..20, 7).collect::<Vec<_>>());
}
Playground
Or avoiding the expensive modulo:
fn skip_every2<I: Iterator>(iter: I, n: usize) -> impl Iterator<Item = <I as Iterator>::Item> {
    iter.zip((0..n).rev().cycle())
        .filter_map(|(v, i)| if i != 0 { Some(v) } else { None })
}
Playground
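As a quick sanity check, this variant produces the same output as the first one (usage sketch, assuming skip_every2 is in scope):
fn main() {
    // Prints [0, 1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 18, 19]
    println!("{:?}", skip_every2(0..20, 7).collect::<Vec<_>>());
}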

Use mutable iterator twice

I'm trying to write a general function that takes an iterable (or iterator) and iterates it twice, at least once mutably, like:
fn f<'a, I>(iter: I)
where
    I: Iterator<Item = &'a mut i32> + Clone,
{
    for i in iter.clone() {
        println!("{}", *i);
    }
    for i in iter.clone() {
        *i += 1;
    }
}
But it doesn't work, because mutable iterators tend not to have clone() implemented, and for good reasons. My real-world example is iterating over HashMap values, where std::collections::hash_map::ValuesMut is not Clone. Are there any ways to do it?
Unfortunately you are unable to do this. You will either need to merge them into a single for loop or save the items from the iterator to iterate over them again later.
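For the second option, a minimal sketch (with made-up map contents): collect the mutable references once, then walk the Vec twice.
use std::collections::HashMap;

fn main() {
    let mut map = HashMap::new();
    map.insert("a", 1);
    map.insert("b", 2);

    // Collect the mutable references once...
    let refs: Vec<&mut i32> = map.values_mut().collect();
    // ...read through them...
    for r in &refs {
        println!("{}", **r);
    }
    // ...then mutate them.
    for r in refs {
        *r += 1;
    }
}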
The closest thing I could come up with is to use IntoIterator to require that the argument can be used to make a new iterator multiple times.
pub fn foo<'a, T>(iter: &'a mut T)
where
    for<'b> &'b mut T: IntoIterator<Item = &'a mut i32>,
{
    for i in iter.into_iter() {
        println!("{}", *i);
    }
    for i in iter.into_iter() {
        *i += 1;
    }
}

let mut map = HashMap::new();
map.insert(2, 5);
map.insert(6, 1);
map.insert(3, 4);
foo(&mut map.values_mut())
However, it seems like much less of a headache for you if you just pass a reference to the entire map.
pub fn bar<T>(map: &mut HashMap<T, i32>) {
    for i in map.values() {
        println!("{}", *i);
    }
    for i in map.values_mut() {
        *i += 1;
    }
}
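A quick usage sketch for bar (the keys and values here are just placeholders):
use std::collections::HashMap;

fn main() {
    let mut map: HashMap<&str, i32> = HashMap::new();
    map.insert("a", 1);
    map.insert("b", 2);

    bar(&mut map); // prints the values, then increments each one
    assert_eq!(map["a"], 2);
    assert_eq!(map["b"], 3);
}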

Lockless processing of non-overlapping, non-contiguous indexes by multiple threads in Rust

I am practicing rust and decided to create a Matrix ops/factorization project.
Basically I want to be able to process the underlying vector in multiple threads. Since I will be providing each thread non-overlapping indexes (which may or may not be contiguous) and the threads will be joined before the end of whatever function created them, there is no need for a lock/synchronization.
I know that there are several crates that can do this, but I would like to know if there is a relatively idiomatic crate-free way to implement it on my own.
The best I could come up with is this (I've simplified the code a bit):
use std::thread;

// This represents the Matrix
#[derive(Debug, Clone)]
pub struct MainStruct {
    pub data: Vec<f64>,
}

// This is the bit that will be shared by the threads;
// ideally it should have its lifetime tied to that of MainStruct,
// but I have no idea how to make PhantomData work in this case.
#[derive(Debug, Clone)]
pub struct SliceTest {
    pub data: Vec<SubSlice>,
}

// This struct is to hide *mut f64 to allow it to be shared to other threads
#[derive(Debug, Clone)]
pub struct SubSlice {
    pub data: *mut f64,
}

impl MainStruct {
    pub fn slice(&mut self) -> (SliceTest, SliceTest) {
        let mut out_vec_odd: Vec<SubSlice> = Vec::new();
        let mut out_vec_even: Vec<SubSlice> = Vec::new();
        unsafe {
            let ptr = self.data.as_mut_ptr();
            for i in 0..self.data.len() {
                let ptr_to_push = ptr.add(i);
                // Non-contiguous idxs
                if i % 2 == 0 {
                    out_vec_even.push(SubSlice { data: ptr_to_push });
                } else {
                    out_vec_odd.push(SubSlice { data: ptr_to_push });
                }
            }
        }
        (SliceTest { data: out_vec_even }, SliceTest { data: out_vec_odd })
    }
}

impl SubSlice {
    pub fn set(&self, val: f64) {
        unsafe { *(self.data) = val; }
    }
}

unsafe impl Send for SliceTest {}
unsafe impl Send for SubSlice {}

fn main() {
    let mut maindata = MainStruct {
        data: vec![0.0, 1.0, 2.0, 3.0, 4.0, 5.0],
    };
    let (mut outvec1, mut outvec2) = maindata.slice();
    let mut threads = Vec::new();
    threads.push(thread::spawn(move || {
        for i in 0..outvec1.data.len() {
            outvec1.data[i].set(999.9);
        }
    }));
    threads.push(thread::spawn(move || {
        for i in 0..outvec2.data.len() {
            outvec2.data[i].set(999.9);
        }
    }));
    for handles in threads {
        handles.join();
    }
    println!("maindata = {:?}", maindata.data);
}
EDIT:
Following kmdreko's suggestion below, I got the code to work exactly how I wanted without using unsafe code, yay!
Of course, in terms of performance it may be cheaper to copy the f64 values than to create vectors of mutable references, unless your struct is filled with other structs instead of f64s.
extern crate crossbeam;
use crossbeam::thread;

#[derive(Debug, Clone)]
pub struct Matrix {
    data: Vec<f64>,
    m: usize, // number of rows
    n: usize, // number of cols
}
...
impl Matrix {
    ...
    pub fn get_data_mut(&mut self) -> &mut Vec<f64> {
        &mut self.data
    }

    pub fn calculate_idx(max_cols: usize, i: usize, j: usize) -> usize {
        let actual_idx = j + max_cols * i;
        actual_idx
    }

    // Get individual mutable references for contiguous indexes (rows)
    pub fn get_all_row_slices(&mut self) -> Vec<Vec<&mut f64>> {
        let max_cols = self.max_cols();
        let max_rows = self.max_rows();
        let inner_data = self.get_data_mut().chunks_mut(max_cols);
        let mut out_vec: Vec<Vec<&mut f64>> = Vec::with_capacity(max_rows);
        for chunk in inner_data {
            let row_vec = chunk.iter_mut().collect();
            out_vec.push(row_vec);
        }
        out_vec
    }

    // Get mutable references for disjoint indexes (columns)
    pub fn get_all_col_slices(&mut self) -> Vec<Vec<&mut f64>> {
        let max_cols = self.max_cols();
        let max_rows = self.max_rows();
        let inner_data = self.get_data_mut().chunks_mut(max_cols);
        let mut out_vec: Vec<Vec<&mut f64>> = Vec::with_capacity(max_cols);
        for _ in 0..max_cols {
            out_vec.push(Vec::with_capacity(max_rows));
        }
        let mut inner_idx = 0;
        for chunk in inner_data {
            let row_vec_it = chunk.iter_mut();
            for elem in row_vec_it {
                out_vec[inner_idx].push(elem);
                inner_idx += 1;
            }
            inner_idx = 0;
        }
        out_vec
    }
    ...
}

fn test_multithreading() {
    fn test(in_vec: Vec<&mut f64>) {
        for elem in in_vec {
            *elem = 33.3;
        }
    }

    fn launch_task(mat: &mut Matrix, f: fn(Vec<&mut f64>)) {
        let test_vec = mat.get_all_row_slices();
        thread::scope(|s| {
            for elem in test_vec.into_iter() {
                s.spawn(move |_| {
                    println!("Spawning thread...");
                    f(elem);
                });
            }
        }).unwrap();
    }

    let rows = 4;
    let cols = 3;
    // new function code omitted, returns Result<Self, MatrixError>
    let mut mat = Matrix::new(rows, cols).unwrap();
    launch_task(&mut mat, test);
    for i in 0..rows {
        for j in 0..cols {
            // Requires the Index trait implemented for Matrix
            assert_eq!(mat[(i, j)], 33.3);
        }
    }
}
This API is unsound. Since there is no lifetime annotation binding SliceTest and SubSlice to the MainStruct, they can outlive the data they point into, and using them after the data has been destroyed would result in use-after-free errors.
It's easy to make it safe, though: you can use .iter_mut() to get distinct mutable references to your elements:
pub fn slice(&mut self) -> (Vec<&mut f64>, Vec<&mut f64>) {
    let mut out_vec_even = vec![];
    let mut out_vec_odd = vec![];
    for (i, item_ref) in self.data.iter_mut().enumerate() {
        if i % 2 == 0 {
            out_vec_even.push(item_ref);
        } else {
            out_vec_odd.push(item_ref);
        }
    }
    (out_vec_even, out_vec_odd)
}
However, this surfaces another problem: thread::spawn cannot hold references to local variables. The threads created are allowed to live beyond the scope they're created in, so even though you did .join() them, you aren't required to. This was a potential issue in your original code as well; the compiler just couldn't warn about it.
There's no easy way to solve this. You'd need a non-referential way to share the data with the other threads, but that would mean using Arc, which doesn't allow mutating its data, so you'd also have to resort to a Mutex, which is what you've tried to avoid.
I would suggest reaching for scope from the crossbeam crate, which does allow you to spawn threads that reference local data. I know you've wanted to avoid using crates, but this is the best solution in my opinion.
See a working version on the playground.
See:
How to get multiple mutable references to elements in a Vec?
Can you specify a non-static lifetime for threads?

Optionally call `skip` in a custom iterator `next()` function

I have a custom iterator and I would like to optionally call .skip(...) in the custom .next() method. However, I get a type error because Skip != Iterator.
Sample code is as follows:
struct CrossingIter<'a, T> {
    index: usize,
    iter: std::slice::Iter<'a, T>,
}

impl<'a, T: Float> Iterator for CrossingIter<'a, T> {
    type Item = (usize, T);

    fn next(&mut self) -> Option<(usize, T)> {
        let iter = (&mut self.iter).enumerate();
        let iter = if self.index == 0 {
            self.index += 3;
            iter.skip(3)
        } else {
            iter
        };
        // lots of code here working with the new iterator
        iter.next()
    }
}
The issue is that after calling .skip(3), the type of iter has changed. One solution would be to duplicate the // lots of code ... in each branch of the if statement, but I'd rather not.
My question is: Is there a way to conditionally apply skip(...) to an iterator and continue working with it without duplicating a bunch of code?
skip is designed to construct a new iterator, which is very useful in situations where you want your code to remain, at least on the surface, immutable. However, in your case, you want to advance the existing iterator while still leaving it valid.
There is advance_by, which does what you want, but it's nightly-only, so it won't run on stable Rust.
if self.index == 0 {
    self.index += 3;
    self.iter.advance_by(3);
}
We can abuse nth to get what we want, but it's not very idiomatic.
if self.index == 0 {
    self.index += 3;
    self.iter.nth(2);
}
If I saw that code in production, I'd be quite puzzled.
The simplest, though not terribly satisfying, answer is to just reimplement advance_by as a helper function. The source is available and pretty easy to adapt:
fn my_advance_by(iter: &mut impl Iterator, n: usize) -> Result<(), usize> {
    for i in 0..n {
        iter.next().ok_or(i)?;
    }
    Ok(())
}
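In the custom next() above, the helper would then be used roughly like this (a sketch; the Err case, when fewer than three elements remain, is simply ignored here):
if self.index == 0 {
    self.index += 3;
    let _ = my_advance_by(&mut self.iter, 3);
}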
All this being said, if your use case is actually just to skip the first three elements, all you need is to start with the skip call and assume your iterator is always a Skip:
struct CrossingIter<'a, T> {
    index: usize,
    iter: std::iter::Skip<std::slice::Iter<'a, T>>,
}
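For illustration, a hypothetical constructor for that variant; what index should start at depends on the rest of the (unshown) code:
impl<'a, T> CrossingIter<'a, T> {
    fn new(data: &'a [T]) -> Self {
        CrossingIter {
            index: 3,
            // The skip happens once, at construction time.
            iter: data.iter().skip(3),
        }
    }
}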
I think #Silvio's answer is a better perspective.
You can call skip(0) instead of returning the iterator itself in the else branch, so both arms have the same Skip type.
Also, the item type produced by enumerate, (usize, &T), doesn't match your definition fn next(&mut self) -> Option<(usize, T)>; you need to map it.
Here is a working example:
use num::Float;

struct CrossingIter<'a, T> {
    index: usize,
    iter: std::slice::Iter<'a, T>,
}

impl<'a, T: Float> Iterator for CrossingIter<'a, T> {
    type Item = (usize, T);

    fn next(&mut self) -> Option<(usize, T)> {
        let iter = (&mut self.iter).enumerate();
        let mut iter = if self.index == 0 {
            self.index += 3;
            iter.skip(3)
        } else {
            iter.skip(0)
        };
        // lots of code here working with the new iterator
        iter.next().map(|(i, &v)| (i, v))
    }
}

How to define mutual recursion with closures?

I can do something like this:
fn func() -> (Vec<i32>, Vec<i32>) {
    let mut u = vec![0; 5];
    let mut v = vec![0; 5];

    fn foo(u: &mut [i32], v: &mut [i32], i: usize, j: usize) {
        for k in i + 1..u.len() {
            u[k] += 1;
            bar(u, v, k, j);
        }
    }

    fn bar(u: &mut [i32], v: &mut [i32], i: usize, j: usize) {
        for k in j + 1..v.len() {
            v[k] += 1;
            foo(u, v, i, k);
        }
    }

    foo(&mut u, &mut v, 0, 0);
    (u, v)
}

fn main() {
    let (u, v) = func();
    println!("{:?}", u);
    println!("{:?}", v);
}
but I would prefer to do something like this:
fn func() -> (Vec<i32>, Vec<i32>) {
    let mut u = vec![0; 5];
    let mut v = vec![0; 5];

    let foo = |i, j| {
        for k in i + 1..u.len() {
            u[k] += 1;
            bar(k, j);
        }
    };

    let bar = |i, j| {
        for k in j + 1..v.len() {
            v[k] += 1;
            foo(i, k);
        }
    };

    foo(0, 0);
    (u, v)
}

fn main() {
    let (u, v) = func();
    println!("{:?}", u);
    println!("{:?}", v);
}
The second example doesn't compile, with the error: unresolved name bar.
In my task I could do it with a single recursive function, but it would not look as clear.
Does anyone have any other suggestions?
I have a solution for mutually recursive closures, but it doesn't work with multiple mutable borrows, so I couldn't extend it to your example.
There is a way to define mutually recursive closures, using an approach similar to how this answer does single recursion. You can put the closures together into a struct, where each of them takes a borrow of that struct as an extra argument.
fn func(n: u32) -> bool {
    struct EvenOdd<'a> {
        even: &'a dyn Fn(u32, &EvenOdd<'a>) -> bool,
        odd: &'a dyn Fn(u32, &EvenOdd<'a>) -> bool,
    }
    let evenodd = EvenOdd {
        even: &|n, evenodd| {
            if n == 0 {
                true
            } else {
                (evenodd.odd)(n - 1, evenodd)
            }
        },
        odd: &|n, evenodd| {
            if n == 0 {
                false
            } else {
                (evenodd.even)(n - 1, evenodd)
            }
        },
    };
    (evenodd.even)(n, &evenodd)
}

fn main() {
    println!("{}", func(5));
    println!("{}", func(6));
}
While defining mutually recursive closures works in some cases, as demonstrated in the answer by Alex Knauth, I don't think that's an approach you should usually take. It is kind of opaque, has some limitations pointed out in the other answer, and it also has a performance overhead since it uses trait objects and dynamic dispatch at runtime.
Closures in Rust can be thought of as functions with associated structs storing the data you closed over. So a more general solution is to define your own struct storing the data you want to close over, and define methods on that struct instead of closures. For this case, the code could look like this:
pub struct FooBar {
    pub u: Vec<i32>,
    pub v: Vec<i32>,
}

impl FooBar {
    fn new(u: Vec<i32>, v: Vec<i32>) -> Self {
        Self { u, v }
    }

    fn foo(&mut self, i: usize, j: usize) {
        for k in i + 1..self.u.len() {
            self.u[k] += 1;
            self.bar(k, j);
        }
    }

    fn bar(&mut self, i: usize, j: usize) {
        for k in j + 1..self.v.len() {
            self.v[k] += 1;
            self.foo(i, k);
        }
    }
}

fn main() {
    let mut x = FooBar::new(vec![0; 5], vec![0; 5]);
    x.foo(0, 0);
    println!("{:?}", x.u);
    println!("{:?}", x.v);
}
(Playground)
While this can get slightly more verbose than closures, and requires a few more explicit type annotations, it's more flexible and easier to read, so I would generally prefer this approach.
