I have a function that operates on a Vec<T>, whose purpose is to extend the vector with new items generated from references to existing items. I'm trying to run the generation of new data in parallel using rayon.
This is a minimal example:
use itertools::Itertools;
use rayon::prelude::*;
fn main() {
let mut foo = Foo {
data: (0..1000).into_iter().collect(),
};
foo.run();
}
struct Foo<T> {
data: Vec<T>,
}
type Pair<'a, T> = (&'a T, &'a T);
impl<'a, T: Clone + 'a> Foo<T>
where
Vec<Pair<'a, T>>: IntoParallelIterator<Item = Pair<'a, T>>,
[T; 2]: IntoParallelIterator,
Vec<T>: FromParallelIterator<<[T; 2] as IntoParallelIterator>::Item>,
{
fn run(&'a mut self) {
let combinations: Vec<Pair<'a, T>> = self
.data
.iter()
.combinations(2)
.map(|x| (x[0], x[1]))
.collect();
let mut new_combinations: Vec<T> = combinations
.into_par_iter()
.flat_map(|(a, b)| bar(a, b))
.collect();
self.data.append(&mut new_combinations);
}
}
fn bar<T: Clone>(a: &T, b: &T) -> [T; 2] {
[a.clone(), b.clone()]
}
You can find a link to Playground here.
Building the above example raises this error:
error[E0502]: cannot borrow `self.data` as mutable because it is also borrowed as immutable
--> src/main.rs:36:9
|
17 | impl<'a, T: Clone + 'a> Foo<T>
| -- lifetime `'a` defined here
...
24 | let combinations: Vec<Pair<'a, T>> = self
| ___________________________----------------___-
| | |
| | type annotation requires that `self.data` is borrowed for `'a`
25 | | .data
26 | | .iter()
| |___________________- immutable borrow occurs here
...
36 | self.data.append(&mut new_combinations);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ mutable borrow occurs here
As far as I understand, since I am collecting into let new_combinations: Vec<T>, there should be no immutable references to self.data left, and I should in theory be able to borrow it mutably to append the new combinations. However, it seems that self.data is borrowed for 'a, which extends beyond the scope of this method. I cannot find a way to avoid specifying the lifetime in fn run(&'a mut self), since I need to state that the references to the items of self.data cannot outlive self when creating the combinations.
Is there a way to allow this method to operate as expected, that is: 1) select a list of references to the items in self.data, 2) apply a function that creates new items T in parallel and finally 3) update self.data with the new items?
Note that as a workaround one could return new_combinations from the method and append them to self.data separately.
It would be great if all of this were possible with as few collect() calls as possible, operating directly on iterators.
The elements in new_combinations are cloned and therefore no longer borrow from combinations. Your annotations, however, tie everything to the impl-level lifetime 'a: the type Vec<Pair<'a, T>> forces self.data to stay immutably borrowed for all of 'a, which outlasts the method body, so Rust has to treat the borrow as still live at the append.
I personally think you are overusing lifetime annotations here; you could remove almost all of them. The compiler is very good at figuring them out automatically in most situations.
Further, your trait bounds were sadly misled by the compiler hints. They are all fulfilled automatically once you specify T: Clone + Send + Sync.
Here you go:
use itertools::Itertools;
use rayon::prelude::*;
fn main() {
let mut foo = Foo {
data: (0..10).collect(),
};
foo.run();
println!("{:?}", foo.data);
}
struct Foo<T> {
data: Vec<T>,
}
type Pair<'a, T> = (&'a T, &'a T);
impl<T: Clone + Send + Sync> Foo<T> {
fn run(&mut self) {
let combinations: Vec<Pair<T>> = self
.data
.iter()
.combinations(2)
.map(|x| (x[0], x[1]))
.collect();
let mut new_combinations: Vec<T> = combinations
.into_par_iter()
.flat_map(|(a, b)| bar(a, b))
.collect();
self.data.append(&mut new_combinations);
}
}
fn bar<T: Clone>(a: &T, b: &T) -> [T; 2] {
[a.clone(), b.clone()]
}
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 0, 2, 0, 3, 0, 4, 0, 5, 0, 6, 0, 7, 0, 8, 0, 9, 1, 2, 1, 3, 1, 4, 1, 5, 1, 6, 1, 7, 1, 8, 1, 9, 2, 3, 2, 4, 2, 5, 2, 6, 2, 7, 2, 8, 2, 9, 3, 4, 3, 5, 3, 6, 3, 7, 3, 8, 3, 9, 4, 5, 4, 6, 4, 7, 4, 8, 4, 9, 5, 6, 5, 7, 5, 8, 5, 9, 6, 7, 6, 8, 6, 9, 7, 8, 7, 9, 8, 9]
Further, there is really no need for the Pair type:
use itertools::Itertools;
use rayon::prelude::*;
fn main() {
let mut foo = Foo {
data: (0..10).collect(),
};
foo.run();
println!("{:?}", foo.data);
}
struct Foo<T> {
data: Vec<T>,
}
impl<T: Clone + Send + Sync> Foo<T> {
fn run(&mut self) {
let combinations: Vec<_> = self
.data
.iter()
.combinations(2)
.map(|x| (x[0], x[1]))
.collect();
let mut new_combinations: Vec<T> = combinations
.into_par_iter()
.flat_map(|(a, b)| bar(a, b))
.collect();
self.data.append(&mut new_combinations);
}
}
fn bar<T: Clone>(a: &T, b: &T) -> [T; 2] {
[a.clone(), b.clone()]
}
About the last part of your question, the request to remove all .collect() calls if possible:
Sadly, I don't think you will be able to remove any of the collect()s. At least not with your current code layout. You definitely need to collect() between combinations() and into_par_iter(), and you also definitely need to collect() before append, because you need to release all references to self.data before you write into it.
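To make the "release all references before you write" point concrete, here is a minimal sequential sketch using only std. It is an illustration under assumptions, not a replacement for the rayon version: the index pairs stand in for itertools' combinations(2), nothing runs in parallel, and the final collect() is what turns the borrowed view into an owned Vec<T> so that the shared borrow of self.data can end.

```rust
struct Foo<T> {
    data: Vec<T>,
}

impl<T: Clone> Foo<T> {
    /// Sequential stand-in for the parallel version: no explicit lifetimes.
    fn run(&mut self) {
        // Shared borrow of `self.data`; it is last used in the `collect()` below.
        let data = &self.data;
        let mut new_items: Vec<T> = (0..data.len())
            // Index pairs (i, j) with i < j, in the same order as `combinations(2)`.
            .flat_map(|i| (i + 1..data.len()).map(move |j| (i, j)))
            // Same shape as `bar` in the question: two clones per pair.
            .flat_map(|(i, j)| [data[i].clone(), data[j].clone()])
            .collect();
        // `new_items` owns its elements, so the shared borrow is over
        // and the mutable borrow for `append` is accepted.
        self.data.append(&mut new_items);
    }
}
```

With data = [0, 1, 2, 3], run() appends the clones for the pairs (0,1), (0,2), (0,3), (1,2), (1,3), (2,3) in that order.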
This question already has answers here:
Iterator lifetime issue when returning references to inner collection
(2 answers)
Why is adding a lifetime to a trait with the plus operator (Iterator<Item = &Foo> + 'a) needed?
(1 answer)
Closed 7 months ago.
I'm trying to map indices into a slice given a transformation function, returning an iterator. Here is the example code:
pub trait MappedIdxIter<T> {
fn map_idx(&self, transform_func: fn(idx: usize, max: usize) -> Option<usize>,) -> Box<dyn Iterator<Item = T>>;
}
impl<'a, U: 'a> MappedIdxIter<&'a U> for [U] {
fn map_idx(&self, transform_func: fn(idx: usize, max: usize) -> Option<usize>,) -> Box<dyn Iterator<Item = &'a U>> {
let max = self.len();
let iter = (0..max).into_iter()
.filter_map(|i| transform_func(i, max))
.map(|i| &self[i]);
Box::new(iter)
}
}
#[cfg(test)]
mod test {
use super::*;
#[test]
fn should_return_even_idx_values() {
let list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
let mapped: Vec<&u32> = list.map_idx(|i, m|
if (i * 2) < m { Some(i * 2) } else { None }
).collect();
assert_eq!([&0, &2, &4, &6, &8], mapped[..]);
}
#[test]
fn should_return_1_5_3() {
let list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
let mapped: Vec<&u32> = list
.map_idx(|i, _m| match i {
0 => Some(1),
1 => Some(5),
2 => Some(3),
_ => None,
})
.collect();
assert_eq!([&1, &5, &3], mapped[..]);
}
}
Rust playground
I'm getting the following error
error[E0495]: cannot infer an appropriate lifetime for lifetime parameter in function call due to conflicting requirements
--> src/priority_list.rs:107:27
|
107 | .map(|i| &self[i]);
| ^^^^^^^
|
How can I tell Rust that the iterator is only valid for the lifetime of self?
There are some pointers here https://users.rust-lang.org/t/how-to-write-trait-which-returns-an-iterator/44828 but I failed to make them work.
EDIT:
I've been able to implement this using the awesome nougat crate; see the comment.
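For comparison, one way to express "the iterator is only valid as long as self is borrowed" without an extra crate is sketched below. This is my own variant, not what the asker ended up using: it assumes you are free to change the trait so that 'a becomes a trait parameter, takes the receiver as &'a self, and adds + 'a to the boxed trait object (which otherwise defaults to 'static).

```rust
/// Variant of the trait from the question: `'a` is a trait parameter, so the
/// borrow of `self` and the returned iterator can share the same lifetime.
pub trait MappedIdxIter<'a, T: 'a> {
    fn map_idx(
        &'a self,
        transform_func: fn(idx: usize, max: usize) -> Option<usize>,
    ) -> Box<dyn Iterator<Item = T> + 'a>;
}

impl<'a, U: 'a> MappedIdxIter<'a, &'a U> for [U] {
    fn map_idx(
        &'a self,
        transform_func: fn(idx: usize, max: usize) -> Option<usize>,
    ) -> Box<dyn Iterator<Item = &'a U> + 'a> {
        let max = self.len();
        let iter = (0..max)
            // `move` copies the fn pointer and `max` into the closure.
            .filter_map(move |i| transform_func(i, max))
            // `move` copies the `&'a [U]` reference, so items live for `'a`.
            .map(move |i| &self[i]);
        Box::new(iter)
    }
}
```

As in the original code, an index returned by transform_func that is out of bounds will panic when the iterator reaches it.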
I'm trying to iterate over a slice broken into chunks, and return a tuple with the nth element of each chunk.
Example:
&[1,2,3,4,5,6,7,8,9]
I'd like to break this into chunks of size 3, and then iterate over the results, returning these tuples, one on each next() call:
&mut[1,4,7], &mut[2,5,8], &mut[3,6,9]
I know that in general it isn't possible to return mutable references like this, but these are clearly disjoint, and without writing unsafe code we already have the ChunksMut (https://doc.rust-lang.org/std/slice/struct.ChunksMut.html) iterator, so maybe there's a way. For example, I can have 3 ChunksMut and then the compiler knows that the elements returned from them are disjoint.
This is my try for non mutable:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=cfa7ca0bacbe6f1535050cd7dd5c537c
PS: I want to avoid a Vec or any allocation on each iteration, so I always return a reference to the iterator's internal slice.
The Iterator trait doesn't support this, because its contract allows the caller to extract several values and use all of them. For example, the following is permitted by Iterator but wouldn't be supported by your implementation:
// take two values out of the iterator
let a = it.next().unwrap();
let b = it.next().unwrap();
What you need is a "lending iterator" (also known as a "streaming iterator"); see e.g. this crate. Writing lending iterators is much easier with GATs (stabilized in Rust 1.65), but they still aren't supported by std::iter::Iterator.
Using the standard Iterator you can avoid allocation by using ArrayVec or an equivalent replacement for Vec, as suggested by @Stargateur.
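To show what "lending" means in practice, here is a small sketch of such a trait written with GATs (this needs Rust 1.65 or newer). The names LendingIterator and WindowsMut are made up for this example and do not come from std or from any particular crate. Each item borrows from the iterator itself, so the previous item has to be dropped before next() can be called again, which is exactly what Iterator cannot express.

```rust
/// Hypothetical lending-iterator trait: the item may borrow from `self`.
trait LendingIterator {
    type Item<'a>
    where
        Self: 'a;
    fn next(&mut self) -> Option<Self::Item<'_>>;
}

/// Overlapping mutable windows over a slice; impossible with `Iterator`,
/// because two windows alive at once would alias.
struct WindowsMut<'s, T> {
    slice: &'s mut [T],
    start: usize,
    size: usize,
}

impl<'s, T> WindowsMut<'s, T> {
    fn new(slice: &'s mut [T], size: usize) -> Self {
        assert!(size > 0, "window size must be non-zero");
        WindowsMut { slice, start: 0, size }
    }
}

impl<'s, T> LendingIterator for WindowsMut<'s, T> {
    type Item<'a> = &'a mut [T]
    where
        Self: 'a;

    fn next(&mut self) -> Option<Self::Item<'_>> {
        let end = self.start.checked_add(self.size)?;
        // Returns `None` once the window would run past the end.
        let window = self.slice.get_mut(self.start..end)?;
        self.start += 1;
        Some(window)
    }
}
```

Because the trait is not Iterator, a for loop does not work; you drive it with while let Some(w) = it.next() { ... }.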
I'm pretty sure you wanted to get mutable references into the original slice using the iterator, resulting in &mut [&mut 1, &mut 4, &mut 7], &mut [&mut 2, &mut 5, &mut 8], &mut [&mut 3, &mut 6, &mut 9].
Without allocation / unsafe / external crates. Requires Rust version 1.55 or greater:
fn iter_chunks<T, const CHUNK_SIZE: usize>(
slice: &mut [T],
) -> impl Iterator<Item = [&mut T; CHUNK_SIZE]> + '_ {
assert_eq!(slice.len() % CHUNK_SIZE, 0);
let len = slice.len();
let mut a: [_; CHUNK_SIZE] = array_collect(
slice
.chunks_mut(len / CHUNK_SIZE)
.map(|iter| iter.iter_mut()),
);
(0..len / CHUNK_SIZE).map(move |_| array_collect(a.iter_mut().map(|i| i.next().unwrap())))
}
/// Builds an array from the first `N` items of an iterator
///
/// Panics:
///
/// If there are fewer than `N` items in the iterator
fn array_collect<T, const N: usize>(mut iter: impl Iterator<Item = T>) -> [T; N] {
let a: [(); N] = [(); N];
a.map(|_| iter.next().unwrap())
}
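On newer compilers the helper can be written a little more directly. The sketch below assumes Rust 1.63 or later, where std::array::from_fn is stable; the name array_collect_from_fn is only chosen here to keep it apart from the helper above.

```rust
/// Builds an array from the first `N` items of an iterator.
///
/// Panics if the iterator yields fewer than `N` items.
fn array_collect_from_fn<T, const N: usize>(mut iter: impl Iterator<Item = T>) -> [T; N] {
    // `from_fn` calls the closure once per element; the index is ignored
    // because the items come from the iterator.
    std::array::from_fn(|_| iter.next().expect("iterator yielded fewer than N items"))
}
```

It also works for non-Copy items such as &mut T, which is what iter_chunks needs.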
Without allocation, using an external crate:
We need to use arrayvec since Rust's arrays cannot be used with collect.
use arrayvec::ArrayVec;
fn main() {
let slice = &mut [1, 2, 3, 4, 5, 6, 7, 8, 9];
for (i, chunk) in iter_chunks::<_, 3>(slice).enumerate() {
println!("{:?}", chunk);
for t in chunk {
*t = i;
}
}
println!("slice: {:?}", slice);
}
fn iter_chunks<T, const CHUNK_SIZE: usize>(
slice: &mut [T],
) -> impl Iterator<Item = ArrayVec<&mut T, CHUNK_SIZE>> + '_ {
let len = slice.len();
let mut a: ArrayVec<_, CHUNK_SIZE> = slice
.chunks_mut(len / CHUNK_SIZE)
.map(|chunk| chunk.iter_mut())
.collect();
(0..len / CHUNK_SIZE).map(move |_| {
a.iter_mut()
.map(|iter| iter.next().unwrap())
.collect::<ArrayVec<_, CHUNK_SIZE>>()
})
}
Output:
[1, 4, 7]
[2, 5, 8]
[3, 6, 9]
slice: [0, 1, 2, 0, 1, 2, 0, 1, 2]
I want to iterate over clients in a Vec and process each using a method that is supposed to take all the other clients as an argument as well.
There's no such iterator that I'm aware of, but it's not complicated to create your own:
struct X<'a, T: 'a> {
item: &'a T,
before: &'a [T],
after: &'a [T],
}
struct AllButOne<'a, T: 'a> {
slice: &'a [T],
index: usize,
}
impl<'a, T> AllButOne<'a, T> {
fn new(slice: &'a [T]) -> Self {
AllButOne { slice, index: 0 }
}
}
impl<'a, T> Iterator for AllButOne<'a, T> {
type Item = X<'a, T>;
fn next(&mut self) -> Option<Self::Item> {
if self.index >= self.slice.len() {
return None;
}
let (before, middle) = self.slice.split_at(self.index);
let (middle, after) = middle.split_at(1);
self.index += 1;
Some(X {
before,
after,
item: &middle[0],
})
}
}
fn main() {
let a = [1, 2, 3, 4];
for x in AllButOne::new(&a) {
println!("{:?}, {}, {:?}", x.before, x.item, x.after);
}
}
[], 1, [2, 3, 4]
[1], 2, [3, 4]
[1, 2], 3, [4]
[1, 2, 3], 4, []
This returns two slices, one for all the values before the current item and one for after. You can perform allocation and stick them into the same collection if you need.
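If the processing method needs to mutate the current client while only reading the others, a similar split works with split_at_mut. The following is a sketch under that assumption; for_each_with_others is a name invented for this example, and it takes a callback instead of implementing Iterator.

```rust
/// Calls `f(item, before, after)` for every element of `items`, giving
/// mutable access to the current element and shared access to the rest.
fn for_each_with_others<T>(items: &mut [T], mut f: impl FnMut(&mut T, &[T], &[T])) {
    for i in 0..items.len() {
        // Everything before index `i`, and the rest starting at `i`.
        let (before, rest) = items.split_at_mut(i);
        // `rest` is non-empty because `i < items.len()`.
        let (item, after) = rest
            .split_first_mut()
            .expect("index is in bounds, so the tail is non-empty");
        f(item, &*before, &*after);
    }
}
```

Note that later elements see the already updated values of earlier ones, since everything happens in place.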
I have an array of an unknown size, and I would like to get a slice of that array and convert it to a statically sized array:
fn pop(barry: &[u8]) -> [u8; 3] {
barry[0..3] // expected array `[u8; 3]`, found slice `[u8]`
}
How would I do this?
You can easily do this with the TryInto trait (which was stabilized in Rust 1.34):
// Before Rust 2021, you need to import the trait:
// use std::convert::TryInto;
fn pop(barry: &[u8]) -> [u8; 3] {
barry.try_into().expect("slice with incorrect length")
}
But even better: there is no need to clone/copy your elements! It is actually possible to get a &[u8; 3] from a &[u8]:
fn pop(barry: &[u8]) -> &[u8; 3] {
barry.try_into().expect("slice with incorrect length")
}
As mentioned in the other answers, you probably don't want to panic if the length of barry is not 3, but instead handle this error gracefully.
This works thanks to these impls of the related trait TryFrom (before Rust 1.47, these only existed for arrays up to length 32):
impl<'_, T, const N: usize> TryFrom<&'_ [T]> for [T; N]
where
T: Copy,
impl<'a, T, const N: usize> TryFrom<&'a [T]> for &'a [T; N]
impl<'a, T, const N: usize> TryFrom<&'a mut [T]> for &'a mut [T; N]
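As for handling a wrong length gracefully instead of panicking, a minimal sketch could simply pass the conversion error on to the caller. The function names try_pop and sum_three are made up for this illustration; TryFromSliceError is the error type these std impls return.

```rust
use std::array::TryFromSliceError;
// Needed before the 2021 edition; redundant but harmless afterwards.
use std::convert::TryInto;

/// Like `pop`, but reports a wrong length instead of panicking.
fn try_pop(barry: &[u8]) -> Result<[u8; 3], TryFromSliceError> {
    barry.try_into()
}

/// Example caller: `?` propagates the error to whoever called us.
fn sum_three(barry: &[u8]) -> Result<u32, TryFromSliceError> {
    let [a, b, c] = try_pop(barry)?;
    Ok(u32::from(a) + u32::from(b) + u32::from(c))
}
```

Keep in mind that the conversion only succeeds when the slice has exactly three elements; a longer slice is an error too, so slice it first (for example with barry.get(..3)) if you only want a prefix.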
Thanks to @malbarbo we can use this helper function:
use std::convert::AsMut;
fn clone_into_array<A, T>(slice: &[T]) -> A
where
A: Default + AsMut<[T]>,
T: Clone,
{
let mut a = A::default();
<A as AsMut<[T]>>::as_mut(&mut a).clone_from_slice(slice);
a
}
to get a much neater syntax:
fn main() {
let original = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
let e = Example {
a: clone_into_array(&original[0..4]),
b: clone_into_array(&original[4..10]),
};
println!("{:?}", e);
}
as long as T: Default + Clone.
If you know your type implements Copy, you can use this form:
use std::convert::AsMut;
fn copy_into_array<A, T>(slice: &[T]) -> A
where
A: Default + AsMut<[T]>,
T: Copy,
{
let mut a = A::default();
<A as AsMut<[T]>>::as_mut(&mut a).copy_from_slice(slice);
a
}
Both variants will panic! if the target array and the passed-in slice do not have the same length.
I recommend using the crate arrayref, which has a handy macro for doing just this.
Note that, using this crate, you create a reference to an array, &[u8; 3], because it doesn't clone any data!
If you do want to clone the data, then you can still use the macro, but call clone at the end:
#[macro_use]
extern crate arrayref;
fn pop(barry: &[u8]) -> &[u8; 3] {
array_ref!(barry, 0, 3)
}
or
#[macro_use]
extern crate arrayref;
fn pop(barry: &[u8]) -> [u8; 3] {
array_ref!(barry, 0, 3).clone()
}
You can manually create the array and return it.
Here is a function that can easily scale if you want to get more (or less) than 3 elements.
Note that if the slice is too small, the end terms of the array will be 0's.
fn pop(barry: &[u8]) -> [u8; 3] {
let mut array = [0u8; 3];
for (&x, p) in barry.iter().zip(array.iter_mut()) {
*p = x;
}
array
}
Here's a function that matches the type signature you asked for.
fn pop(barry: &[u8]) -> [u8; 3] {
[barry[0], barry[1], barry[2]]
}
But since barry could have fewer than three elements, you may want to return an Option<[u8; 3]> rather than a [u8; 3].
fn pop(barry: &[u8]) -> Option<[u8; 3]> {
if barry.len() < 3 {
None
} else {
Some([barry[0], barry[1], barry[2]])
}
}
I was unhappy with other answers because I needed several functions that return varying length fixed u8 arrays. I wrote a macro that produces functions specific for the task. Hope it helps someone.
#[macro_export]
macro_rules! vec_arr_func {
($name:ident, $type:ty, $size:expr) => {
pub fn $name(data: std::vec::Vec<$type>) -> [$type; $size] {
let mut arr = [0; $size];
arr.copy_from_slice(&data[0..$size]);
arr
}
};
}
//usage - pass in a name for the fn, type of array, length
vec_arr_func!(v32, u8, 32);
v32(data); //where data is std::vec::Vec<u8>
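With const generics (stable since Rust 1.51), a single generic function can probably stand in for the whole family of macro-generated ones. This is only a sketch: prefix_array is a hypothetical name, and unlike the macro it borrows the data and returns None instead of panicking when the input is too short.

```rust
// Needed before the 2021 edition; redundant but harmless afterwards.
use std::convert::TryInto;

/// Copies the first `N` elements of `data` into a fixed-size array,
/// or returns `None` if `data` has fewer than `N` elements.
fn prefix_array<T: Copy, const N: usize>(data: &[T]) -> Option<[T; N]> {
    // `get(..N)` does the bounds check, `try_into` the slice-to-array copy.
    data.get(..N)?.try_into().ok()
}
```

The length is usually inferred from the annotated target type, e.g. let head: [u8; 32] = prefix_array(&data).expect("at least 32 bytes");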
The nice thing Vec, slices and arrays have in common is iter(), so you can zip two of them together and copy element by element, as simple as:
let x = vec![1, 2, 3];
let mut y: [u8; 3] = [Default::default(); 3];
println!("y at startup: {:?}", y);
x.iter().zip(y.iter_mut()).for_each(|(&x, y)| *y = x);
println!("y copied from vec: {:?}", y);
This works because the array is one-dimensional.
To test all together, vec, slice and array, here you go:
let a = [1, 2, 3, 4, 5];
let slice = &a[1..4];
let mut x: Vec<u8> = vec![Default::default(); 3];
println!("X at startup: {:?}", x);
slice.iter().zip(x.iter_mut()).for_each(|(&s, x)| *x = s);
println!("X copied from slice: {:?}", x);
Another option, which should be faster than an element-by-element copy, is:
y[..x.len()].copy_from_slice(&x);
This works for all three. Below is an example:
let a = [1, 2, 3, 4, 5];
let mut b: Vec<u8> = vec![Default::default(); 5];
b[..a.len()].copy_from_slice(&a);
println!("Copy array a into vector b: {:?}", b);
let x: Vec<u8> = vec![1, 2, 3, 4, 5];
let mut y: [u8; 5] = [Default::default(); 5];
y[..x.len()].copy_from_slice(&x);
println!("Copy vector x into array y: {:?}", y);
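One caveat: copy_from_slice panics when the two lengths differ, and y[..x.len()] panics when x is longer than y. If the lengths are not known to match, a small wrapper that copies only the common prefix avoids both. The helper name copy_prefix is made up for this sketch.

```rust
/// Copies as many bytes as fit from `src` into the start of `dst`
/// and returns how many were copied. Never panics on length mismatch.
fn copy_prefix(dst: &mut [u8], src: &[u8]) -> usize {
    let n = dst.len().min(src.len());
    // Both sub-slices have length `n`, which `copy_from_slice` requires.
    dst[..n].copy_from_slice(&src[..n]);
    n
}
```

Bytes of dst beyond the returned count are left untouched.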
This question already has answers here:
How to get a slice as an array in Rust?
(7 answers)
Closed 6 years ago.
I have a structure with some fixed-sized arrays:
struct PublicHeaderBlock_LAS14 {
file_signature: [u8; 4],
file_source_id: u16,
global_encoding: u16,
project_id_data_1: u32,
project_id_data_2: u16,
project_id_data_3: u16,
project_id_data_4: [u8; 8],
version_major: u8,
version_minor: u8,
systemIdentifier: [u8; 32], // ...
}
I'm reading in bytes from a file into a fixed size array and am copying those bytes into the struct bit by bit.
fn create_header_struct_las14(&self, buff: &[u8; 373]) -> PublicHeaderBlock_LAS14 {
PublicHeaderBlock_LAS14 {
file_signature: [buff[0], buff[1], buff[2], buff[3]],
file_source_id: (buff[4] | buff[5] << 7) as u16,
global_encoding: (buff[6] | buff[7] << 7) as u16,
project_id_data_1: (buff[8] | buff[9] << 7 | buff[10] << 7 | buff[11] << 7) as u32,
project_id_data_2: (buff[12] | buff[13] << 7) as u16,
project_id_data_3: (buff[14] | buff[15] << 7) as u16,
project_id_data_4: [buff[16], buff[17], buff[18], buff[19], buff[20], buff[21], buff[22], buff[23]],
version_major: buff[24],
version_minor: buff[25],
systemIdentifier: buff[26..58]
}
}
The last line (systemIdentifier) doesn't work, because in the struct it is a [u8; 32] and buff[26..58] is a slice. Can I convert a slice over a range to a fixed-size array like that, instead of doing what I've done for, say, file_signature?
Edit: Since Rust 1.34, you can use TryInto, which is available through the impl TryFrom<&[T]> for [T; N]:
struct Foo {
arr: [u8; 32],
}
fn fill(s: &[u8; 373]) -> Foo {
// We unwrap here because it will always return `Ok` variant
let arr: [u8; 32] = s[26..58].try_into().unwrap();
Foo { arr }
}
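The same conversion is also handy for the integer fields of the header in the question, where the hand-written shifts are easy to get wrong. The sketch below assumes the fields are stored little-endian (please check the file format you are reading); the helper names read_array, read_u16_le and read_u32_le are invented for this example.

```rust
// Needed before the 2021 edition; redundant but harmless afterwards.
use std::convert::TryInto;

/// Copies `N` bytes starting at `offset`, or returns `None` if out of range.
fn read_array<const N: usize>(buf: &[u8], offset: usize) -> Option<[u8; N]> {
    let end = offset.checked_add(N)?;
    buf.get(offset..end)?.try_into().ok()
}

/// Reads a little-endian u16 at `offset` (byte order is an assumption).
fn read_u16_le(buf: &[u8], offset: usize) -> Option<u16> {
    read_array::<2>(buf, offset).map(u16::from_le_bytes)
}

/// Reads a little-endian u32 at `offset` (byte order is an assumption).
fn read_u32_le(buf: &[u8], offset: usize) -> Option<u32> {
    read_array::<4>(buf, offset).map(u32::from_le_bytes)
}
```

With these, a field such as file_source_id becomes read_u16_le(buff, 4), and an array field becomes read_array::<32>(buff, 26).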
Original answer from 2016:
There is no safe way to initialize an array in a struct directly from a slice. You need to either resort to an unsafe block that operates directly on uninitialized memory, or use one of the following two initialize-then-mutate strategies:
Construct the desired array, then use it to initialize the struct.
struct Foo {
arr: [u8; 32],
}
fn fill(s: &[u8; 373]) -> Foo {
let mut a: [u8; 32] = Default::default();
a.copy_from_slice(&s[26..58]);
Foo { arr: a }
}
Or initialize the struct, then mutate the array inside the struct.
#[derive(Default)]
struct Foo {
arr: [u8; 32],
}
fn fill(s: &[u8; 373]) -> Foo {
let mut f: Foo = Default::default();
f.arr.copy_from_slice(&s[26..58]);
f
}
The first one is cleaner if your struct has many members. The second one may be a little faster if the compiler cannot optimize out the intermediate copy. But you probably will use the unsafe method if this is the performance bottleneck of your program.
Thanks to @malbarbo we can use this helper function:
use std::convert::AsMut;
fn clone_into_array<A, T>(slice: &[T]) -> A
where A: Sized + Default + AsMut<[T]>,
T: Clone
{
let mut a = Default::default();
<A as AsMut<[T]>>::as_mut(&mut a).clone_from_slice(slice);
a
}
to get a much neater syntax:
fn main() {
let original = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
let e = Example {
a: clone_into_array(&original[0..4]),
b: clone_into_array(&original[4..10]),
};
println!("{:?}", e);
}
as long as T: Default + Clone.
It will panic! if the target array and the passed-in slice do not have the same length, because clone_from_slice does.