How does one operate over a subset of a vector?

I understand how to operate on an entire vector, though I don't think this is idiomatic Rust:
fn median(v: &Vec<u32>) -> f32 {
let count = v.len();
if count % 2 == 1 {
v[count / 2] as f32
} else {
(v[count / 2] as f32 + v[count / 2 - 1] as f32) / 2.0
}
}
fn main() {
let mut v1 = vec![3, 7, 8, 5, 12, 14, 21, 13, 18];
v1.sort();
println!("{:.*}", 1, median(&v1));
}
But what if I want to operate on only half of this vector? For example, the first quartile is the median of the lower half, and the third quartile is the median of the upper half. My first thought was to construct two new vectors, but that did not seem quite right.
How do I get "half" a vector?

As mentioned, you want to create a slice using the Index trait with a Range:
let slice = &v1[0..v1.len() / 2];
This is yet another reason why accepting a &Vec is discouraged: the current median would force you to copy the slice into a newly allocated Vec before calling it. Instead, rewrite it to accept a slice:
fn median(v: &[u32]) -> f32 {
// ...
}
Since you are likely interested in splitting a vector / slice in half and getting both parts, split_at may be relevant:
let (head, tail) = v1.split_at(v1.len() / 2);
println!("{:.*}", 1, median(head));
println!("{:.*}", 1, median(tail));
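Putting the slice-based median together with the quartile definition from the question, a minimal sketch might look like the following. The name quartiles is hypothetical, and the code assumes the input is already sorted and has at least two elements. Note that split_at(len / 2) places the middle element of an odd-length slice in the upper half; this sketch instead leaves it out of both halves, which is one common convention and not the only one.

```rust
/// Median of an already sorted, non-empty slice.
fn median(v: &[u32]) -> f32 {
    let count = v.len();
    if count % 2 == 1 {
        v[count / 2] as f32
    } else {
        (v[count / 2] as f32 + v[count / 2 - 1] as f32) / 2.0
    }
}

/// Hypothetical helper: returns (first quartile, median, third quartile).
/// Assumes `sorted` is sorted and has at least two elements.
fn quartiles(sorted: &[u32]) -> (f32, f32, f32) {
    let mid = sorted.len() / 2;
    let lower = &sorted[..mid];
    // For an odd length, skip the middle element so that it
    // belongs to neither half.
    let upper = if sorted.len() % 2 == 1 {
        &sorted[mid + 1..]
    } else {
        &sorted[mid..]
    };
    (median(lower), median(sorted), median(upper))
}
```

No new vector is allocated here: lower and upper are borrowed views into the original data.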

How to find the median of a vector:
fn median(numbers: &mut Vec<i32>) -> i32 {
numbers.sort();
let mid = numbers.len() / 2;
if numbers.len() % 2 == 0 {
// Integer mean of the two middle values (truncates toward zero).
(numbers[mid - 1] + numbers[mid]) / 2
} else {
numbers[mid]
}
}
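How to get half a vector:
Borrow a slice, which leaves the vector unchanged:
let slice: &[i32] = &numbers[0..numbers.len() / 2];
Or use drain, which removes the first half from numbers and moves it into a new Vec:
let half: Vec<i32> = numbers.drain(..numbers.len() / 2).collect();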
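The two options behave quite differently with respect to the original vector, which the following sketch tries to make visible. The function names first_half and take_first_half are hypothetical, chosen only for this illustration.

```rust
/// Borrows the first half; `numbers` is not modified.
fn first_half(numbers: &[i32]) -> &[i32] {
    &numbers[..numbers.len() / 2]
}

/// Removes the first half from `numbers` and returns it as a new Vec.
/// Afterwards `numbers` holds only the second half.
fn take_first_half(numbers: &mut Vec<i32>) -> Vec<i32> {
    let mid = numbers.len() / 2;
    numbers.drain(..mid).collect()
}
```

If the goal is only to read half of the data, as with the quartile computation, the borrowing version is likely the better fit because it neither allocates nor mutates.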

Related

How does one get an iterator to the max value element in Rust?

I want to access the element next to the maximal one in a Vec<i32>. I'm looking for something like this:
let v = vec![1, 3, 2];
let it = v.iter().max_element();
assert_eq!(Some(&2), it.next());
In C++, I would go with std::max_element and then just increase the iterator (with or without bounds checking, depending on how adventurous I feel at the moment). The Rust max only returns a reference to the element, which is not good enough for my use case.
The only solution I came up with is using enumerate to get the index of the item - but this seems manual and cumbersome when compared to the C++ way.
I would prefer something in the standard library.
This example is simplified - I actually want to attach to the highest value and then from that point loop over the whole container (possibly with cycle() or something similar).

C++ iterators are not the same as Rust iterators. Rust iterators are forward-only and can only be traversed once. C++ iterators can be thought of as cursors. See What are the main differences between a Rust Iterator and C++ Iterator? for more details.
In order to accomplish your goal in the most generic way possible, you have to walk through the entire iterator to find the maximum value. Along the way, you have to duplicate the iterator each time you find a new maximum value. At the end, you can return the iterator corresponding to the point after the maximum value.
trait MaxElement {
type Iter;
fn max_element(self) -> Self::Iter;
}
impl<I> MaxElement for I
where
I: Iterator + Clone,
I::Item: PartialOrd,
{
type Iter = Self;
fn max_element(mut self) -> Self::Iter {
let mut max_iter = self.clone();
let mut max_val = None;
while let Some(val) = self.next() {
if max_val.as_ref().map_or(true, |m| &val > m) {
max_iter = self.clone();
max_val = Some(val);
}
}
max_iter
}
}
fn main() {
let v = vec![1, 3, 2];
let mut it = v.iter().max_element();
assert_eq!(Some(&2), it.next());
}
See also:
How can I add new methods to Iterator?
"I actually want to attach to the highest value and then from that point loop over the whole container (possibly with cycle() or something similar)."
In that case, I'd attempt to be more obvious:
fn index_of_max(values: &[i32]) -> Option<usize> {
values
.iter()
.enumerate()
.max_by_key(|(_idx, &val)| val)
.map(|(idx, _val)| idx)
}
fn main() {
let v = vec![1, 3, 2];
let idx = index_of_max(&v).unwrap_or(0);
let (a, b) = v.split_at(idx);
let mut it = b.iter().chain(a).skip(1);
assert_eq!(Some(&2), it.next());
}
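For the "loop over the whole container starting at the maximum" use case, the index-based approach can also be combined with cycle, as the question suggested. This is a sketch under a few assumptions: the name cycle_from_max is hypothetical, the iteration starts at the maximum itself (add skip(1) to start after it), and on ties max_by_key picks the last maximal element.

```rust
/// Hypothetical helper: visits every element exactly once,
/// starting at the maximum and wrapping around at the end.
fn cycle_from_max(values: &[i32]) -> impl Iterator<Item = &i32> + '_ {
    let idx = values
        .iter()
        .enumerate()
        .max_by_key(|&(_, &val)| val)
        .map(|(idx, _)| idx)
        .unwrap_or(0); // empty slice: nothing to iterate anyway
    values.iter().cycle().skip(idx).take(values.len())
}
```

Because take(values.len()) bounds the otherwise endless cycle, the returned iterator terminates after one full lap.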
See also:
What's the fastest way of finding the index of the maximum value in an array?
Using max_by_key on a vector of floats
What is the idiomatic way to get the index of a maximum or minimum floating point value in a slice or Vec in Rust?
Find the item in an array with the largest property

A simple solution for finding the largest value is to use fold.
The following code prints "largest 99":
let vv: Vec<i32> = (1..100).collect();
let largest = vv.iter().fold(i32::MIN, |a, b| a.max(*b));
println!("largest {}", largest);
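If only the largest value is needed, Iterator::max from the standard library does the same job and reports an empty input as None rather than as i32::MIN. A small sketch, with largest as a hypothetical function name:

```rust
/// Hypothetical helper: the largest value, or None for an empty slice.
fn largest(values: &[i32]) -> Option<i32> {
    values.iter().copied().max()
}
```

Like the fold version, this yields a value and not a position, so it does not by itself answer the "element next to the maximum" part of the question.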

If all you want is the value of the item following the maximum, I would do it with a simple call to fold, keeping track of the max found so far and the corresponding next value:
fn main() {
let v = vec![1, 3, 2];
let nxt = v.iter().fold (
(None, None),
|acc, x| {
match acc {
(Some (max), _) if x > max => (Some (x), None),
(Some (max), None) => (Some (max), Some (x)),
(None, _) => (Some (x), None),
_ => acc
}
}
).1;
assert_eq!(Some(&2), nxt);
}
Depending on what you want to do with the items following the max, a similar approach may allow you to do it in a single pass.

Writing a function to take iterables of reference and value type

I'd like to have a function that takes an iterable and returns its smallest and largest elements. This is part of an exercise in learning Rust, but I'm struggling in being able to handle reference types and value types at the same time.
This is what I have:
fn min_max<'a, I, T>(mut iter: I) -> Option<(&'a T, &'a T)>
where
I: Iterator<Item = &'a T>,
T: PartialOrd,
{
let mut min = match iter.next() {
Some(x) => x,
// The collection is empty
None => return None,
};
let mut max = min;
for el in iter {
if el < min {
min = el;
}
if el >= max {
max = el;
}
}
Some((min, max))
}
Then, I give this an iterator over some integers.
let nums: [u32; 6] = [4, 3, 9, 10, 4, 3];
if let Some((min, max)) = min_max(nums.iter()) {
println!("{} {}", min, max);
}
This works, and prints 3 10. But then I want to do some operations on the numbers before I compute the minimum and maximum, like a map and/or a filter.
let doubled = nums.iter().map(|x| 2 * x);
if let Some((min, max)) = min_max(doubled) {
println!("{} {}", min, max);
}
This gives a compiler error:
error[E0271]: type mismatch resolving `<[closure#src/main.rs:31:35: 31:44] as std::ops::FnOnce<(&u32,)>>::Output == &_`
--> src/main.rs:32:31
|
32 | if let Some((min, max)) = min_max(doubled) {
| ^^^^^^^ expected u32, found reference
|
= note: expected type `u32`
found type `&_`
= note: required because of the requirements on the impl of `std::iter::Iterator` for `std::iter::Map<std::slice::Iter<'_, u32>, [closure#src/main.rs:31:35: 31:44]>`
= note: required by `min_max`
This confused me, because if nums.iter() works as an argument, why shouldn't nums.iter().map(...)?
I understand the error message in principle: my array is of u32, not &u32, whereas my function requires Iterator::Item to be of type &'a T. But then I don't get why it errors only on the second sample (using .iter().map()) and not on the first (just .iter()).
I've made a playground with this example and a commented out example where I construct an iterable of integers from a string. This fails in exactly the same way as the second example above (and is closer to my actual use case).
let s = "4 3 9 10 4 3";
let parsed = s.split(" ").map(|x| x.parse::<u32>().unwrap());
if let Some((min, max)) = min_max(parsed) {
println!("{} {}", min, max);
}

"I'd like to have a function that takes an iterable and returns its smallest and largest elements."
Use Itertools::minmax.
"handle reference types and value types at the same time."
You don't need to; references to numbers can also be compared:
fn foo(a: &i32, b: &i32) -> bool {
a < b
}
In your case, remember that a value and a reference to that value are different types. That means you can accept an iterator of any type so long as the yielded values are comparable, and this includes both references and values, as requested:
fn min_max<I>(mut iter: I) -> Option<(I::Item, I::Item)>
where
I: Iterator,
I::Item: Clone + PartialOrd,
{
let mut min = match iter.next() {
Some(x) => x,
// The collection is empty
None => return None,
};
let mut max = min.clone();
for el in iter {
if el < min {
min = el;
} else if el >= max {
max = el;
}
}
Some((min, max))
}
I chose to add the Clone bound although to be more true to your original I could have used the Copy bound. Itertools returns an enum to avoid placing any restrictions on being able to duplicate the value.
This works with all three of your examples:
fn main() {
let nums: [u32; 6] = [4, 3, 9, 10, 4, 3];
if let Some((min, max)) = min_max(nums.iter()) {
println!("{} {}", min, max);
}
let doubled = nums.iter().map(|x| 2 * x);
if let Some((min, max)) = min_max(doubled) {
println!("{} {}", min, max);
}
let s = "4 3 9 10 4 3";
let parsed = s.split(" ").map(|x| x.parse::<u32>().unwrap());
if let Some((min, max)) = min_max(parsed) {
println!("{} {}", min, max);
}
}
3 10
6 20
3 10
"my array is of u32, not &u32, whereas my function requires Iterator::Item to be of type &'a T. But then I don't get why it errors only on the second sample (using .iter().map()) and not on the first (just .iter())."
Because iterating over an array with iter() yields references. By using map, you are changing the type of the iterator's item from &u32 to u32. You could have also chosen to adapt the first call to return values.
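Adapting the first call to return values could look like the sketch below, which uses Iterator::copied to turn the &u32 items into u32. The generic min_max is the one from this answer, shortened with the ? operator; the wrapper min_max_values is a hypothetical name added only for illustration.

```rust
fn min_max<I>(mut iter: I) -> Option<(I::Item, I::Item)>
where
    I: Iterator,
    I::Item: Clone + PartialOrd,
{
    // Returns None early if the iterator is empty.
    let mut min = iter.next()?;
    let mut max = min.clone();
    for el in iter {
        if el < min {
            min = el;
        } else if el >= max {
            max = el;
        }
    }
    Some((min, max))
}

/// Hypothetical wrapper: min and max of a slice as owned values.
fn min_max_values(nums: &[u32]) -> Option<(u32, u32)> {
    // `copied` converts each `&u32` item into a `u32`.
    min_max(nums.iter().copied())
}
```

With this, both the plain slice and the mapped iterator yield u32 items, so the results have the same type regardless of how the numbers were produced.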

You have a type mismatch because .iter() produces a slice iterator (an Iterator with Item = &T), whereas .map(|x| 2 * x) is an iterator adaptor that produces a new iterator of values (an Iterator with Item = T). A reference can only point to a value that is already stored somewhere in memory, so the mapped values must be stored before references to them can be handed out. If you want to keep the reference-based signature, collect the result of map first and then iterate over the collection by reference:
let doubled: Vec<_> = nums.iter().map(|x| 2 * x).collect();
if let Some((min, max)) = min_max(doubled.iter()) {
println!("{} {}", min, max);
}
For more details, see chapter 13.2 Iterators of The Rust Programming Language book.

What type signature to use for an iterator generated from a slice?

I have this toy example, but it's what I'm trying to accomplish:
fn lazy_vec() {
let vec: Vec<i64> = vec![1, 2, 3, 4, 5];
let mut iter: Box<Iterator<Item = i64>> = Box::new(vec.into_iter());
iter = Box::new(iter.map(|x| x + 1));
// potentially do additional similar transformations to iter
println!("{:?}", iter.collect::<Vec<_>>());
}
This (if I'm not mistaken) is a lazy iterator pattern, and the actual map operation doesn't occur until .collect() is called. I want to do the same thing with slices:
fn lazy_slice() {
let vec: Vec<i64> = vec![1, 2, 3, 4, 5];
let slice: &[i64] = &vec[..3];
let mut iter: Box<Iterator<Item = i64>> = Box::new(slice.into_iter());
iter = Box::new(iter.map(|x| x + 1));
// potentially do additional similar transformations to iter
println!("{:?}", iter.collect::<Vec<_>>());
}
This results in a type mismatch:
error[E0271]: type mismatch resolving `<std::slice::Iter<'_, i64> as std::iter::Iterator>::Item == i64`
--> src/main.rs:4:47
|
4 | let mut iter: Box<Iterator<Item = i64>> = Box::new(slice.into_iter());
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected reference, found i64
|
= note: expected type `&i64`
found type `i64`
= note: required for the cast to the object type `std::iter::Iterator<Item=i64>`
I can't figure out what I need to do to resolve this error. The second note made me think I needed:
iter = Box::new(iter.map(|x| x + 1) as Iterator<Item = i64>);
or
iter = Box::new(iter.map(|x| x + 1)) as Box<Iterator<Item = i64>>;
These fail with other errors depending on the exact syntax (e.g. expected reference, found i64, or expected i64, found &i64). I've tried other ways to declare the types involved, but I'm basically just blindly adding & and * in places and not making any progress.
What am I missing here? What do I need to change in order to make this compile?
Edit
Here's a slightly more concrete example - I need iter to be mut so that I can compose an unknown number of such transformations before actually invoking .collect(). My impression was this was a somewhat common pattern, apologies if that wasn't correct.
fn lazy_vec(n: i64) {
let vec: Vec<i64> = vec![1, 2, 3, 4, 5];
let mut iter: Box<Iterator<Item = i64>> = Box::new(vec.into_iter());
for _ in 0..n {
iter = Box::new(iter.map(|x| x + 1));
}
println!("{:?}", iter.collect::<Vec<_>>());
}
I'm aware I could rewrite this specific task in a simpler way (e.g. a single map that adds n to each element) - it's an oversimplified MCVE of the problem I'm running into. My issue is this works for lazy_vec, but I'm not sure how to do the same with slices.
Edit 2
I'm just learning Rust and some of the nomenclature and concepts are new to me. Here's what I'm envisioning doing in Python, for comparison. My intent is to do the same thing with slices that I can currently do with vectors.
#!/usr/bin/env python3
import itertools
ls = [i for i in range(10)]
def lazy_work(input):
for i in range(10):
input = (i + 1 for i in input)
# at this point no actual work has been done
return input
print("From list: %s" % list(lazy_work(ls)))
print("From slice: %s" % list(lazy_work(itertools.islice(ls, 5))))
Obviously in Python there's no issues with typing, but hopefully that more clearly demonstrates my intent?

As discussed in What is the difference between iter and into_iter?, these methods create iterators which yield different types when called on a Vec compared to a slice.
[T]::iter and [T]::into_iter both return an iterator which yields values of type &T. That means that the returned value doesn't implement Iterator<Item = i64> but instead Iterator<Item = &i64>, as the error message states.
However, your subsequent map statements change the type of the iterator's item to an i64, which means the type of the iterator would also need to change. As an analogy, you've essentially attempted this:
let mut a: &i64 = &42;
a = 99;
Iterator::cloned exists to make clones of the iterated value. In this case, it converts a &i64 to an i64, essentially dereferencing the value:
fn lazy_slice(n: i64) {
let array = [1i64, 2, 3, 4, 5];
// `dyn` is required for trait objects as of the 2021 edition.
let mut iter: Box<dyn Iterator<Item = i64>> = Box::new(array.iter().cloned());
for _ in 0..n {
iter = Box::new(iter.map(|x| x + 1));
}
println!("{:?}", iter.collect::<Vec<_>>());
}
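If the boxed iterator needs to leave the function while still borrowing from the slice, the trait object has to carry the slice's lifetime. The following is a sketch of that idea; the name lazy_add and its signature are assumptions made for this example, and copied is used in place of cloned since i64 is Copy.

```rust
/// Hypothetical helper: lazily adds 1 to every element `n` times.
/// The `+ 'a` bound ties the boxed iterator to the borrowed slice.
fn lazy_add<'a>(slice: &'a [i64], n: usize) -> Box<dyn Iterator<Item = i64> + 'a> {
    let mut iter: Box<dyn Iterator<Item = i64> + 'a> = Box::new(slice.iter().copied());
    for _ in 0..n {
        // Each pass wraps the previous iterator; no element is touched yet.
        iter = Box::new(iter.map(|x| x + 1));
    }
    iter
}
```

Nothing is computed until the caller consumes the result, for example with collect, which matches the lazy behaviour of the Python generator version in the question.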

How to set a range in a Vec or slice?

My end goal is to shuffle the rows of a matrix (for which I am using nalgebra).
To address this I need to set a mutable range (slice) of an array.
Supposing I have an array as such (let's say it's a 3x3 matrix):
let mut scores = [7, 8, 9, 10, 11, 12, 13, 14, 15];
I have extracted a row like this:
let r = &scores[..].chunks(3).collect::<Vec<_>>()[1];
Now, for the knuth shuffle I need to swap this with another row. What I need to do is:
scores.chunks_mut(3)[0] = r;
however this fails as such:
cannot index a value of type `core::slice::ChunksMut<'_, _>`
Example: http://is.gd/ULkN6j

I ended up looping over the columns and swapping element by element, which seems like a cleaner implementation to me:
fn swap_row<T>(matrix: &mut [T], row_src: usize, row_dest: usize, cols: usize){
for c in 0..cols {
matrix.swap(cols * row_src + c, cols * row_dest + c);
}
}
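A possible variant of the same idea swaps whole rows at once by splitting the slice into two non-overlapping mutable parts with split_at_mut and then calling swap_with_slice. This is only a sketch: the name swap_rows is hypothetical, and the code assumes a row-major layout with both row indices in bounds.

```rust
/// Hypothetical helper: swaps two rows of a row-major matrix in place.
/// Panics if either row lies outside `matrix`.
fn swap_rows<T>(matrix: &mut [T], row_a: usize, row_b: usize, cols: usize) {
    if row_a == row_b {
        return;
    }
    let (lo, hi) = if row_a < row_b { (row_a, row_b) } else { (row_b, row_a) };
    // `head` ends where the higher row begins, so the two rows
    // live in different halves and can both be borrowed mutably.
    let (head, tail) = matrix.split_at_mut(hi * cols);
    head[lo * cols..(lo + 1) * cols].swap_with_slice(&mut tail[..cols]);
}
```

As with the element-wise loop, no T: Clone bound and no temporary copy of the matrix is needed.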

Your code, as you'd like to write it, can never work. You have an array that you are trying to read from and write to at the same time. This will cause you to have duplicated data:
[1, 2, 3, 4]
// Copy last two to first two
[3, 4, 3, 4]
// Copy first two to last two
[3, 4, 3, 4]
Rust will prevent you from having mutable and immutable references to the same thing for this very reason.
"cannot index a value of type core::slice::ChunksMut<'_, _>"
chunks_mut returns an iterator. The only thing that an iterator is guaranteed to do is return "the next thing". You cannot index it, because its items are not all available at once in contiguous memory.
To move things around, you are going to need somewhere temporary to store the data. One way is to copy the array:
let scores = [7, 8, 9, 10, 11, 12, 13, 14, 15];
let mut new_scores = scores;
for (old, new) in scores[0..3].iter().zip(new_scores[6..9].iter_mut()) {
*new = *old;
}
for (old, new) in scores[3..6].iter().zip(new_scores[0..3].iter_mut()) {
*new = *old;
}
for (old, new) in scores[6..9].iter().zip(new_scores[3..6].iter_mut()) {
*new = *old;
}
Then it's a matter of following one of these existing questions to copy from one to the other.

That's probably closer to what you wanted to do:
fn swap_row<T: Clone>(matrix: &mut [T], row_src: usize, row_dest: usize, cols: usize) {
let v = matrix[..].to_vec();
let mut chunks = v.chunks(cols).collect::<Vec<&[T]>>();
chunks.swap(row_src, row_dest);
// Flatten the reordered rows and copy them back into the matrix.
matrix.clone_from_slice(&chunks.concat());
}
I would prefer:
fn swap_row<T: Clone>(matrix: &[T], row_src: usize, row_dest: usize, cols: usize) -> Vec<T> {
let mut chunks = matrix[..].chunks(cols).collect::<Vec<&[T]>>();
chunks.swap(row_src, row_dest);
// Flatten the reordered rows into a new Vec.
chunks.concat()
}
By the way, nalgebra provides unsafe fn as_slice_unchecked(&self) -> &[T] for all kinds of Storage and RawStorage.
Shuffling this slice avoids the need for row swapping.

How to sum the values in an array, slice, or Vec in Rust?

Editor's note: This question's example is from a version of Rust prior to 1.0 and references types and methods no longer found in Rust. The answers still contain valuable information.
The following code
let mut numbers = new_serial.as_bytes().iter().map(|&x| (x - 48));
let sum = numbers.sum();
results in the following error:
`std::iter::Map<,&u8,u8,std::slice::Items<,u8>>` does not implement any method in scope named `sum`
What must I do to sum an array of bytes?
The following works:
for byte in new_serial.as_bytes().iter() {
sum = sum + (byte - 48);
}

Iterator::sum was stabilized in Rust 1.11.0. You can get an iterator from your array/slice/Vec and then use sum:
fn main() {
let a = [1, 2, 3, 4, 5];
let sum: u8 = a.iter().sum();
println!("the total sum is: {}", sum);
}
Of special note is that you need to specify the type to sum into (sum: u8) as the method allows for multiple implementations. See Why can't Rust infer the resulting type of Iterator::sum? for more information.
Applied to your original example:
let new_serial = "01234";
let sum: u8 = new_serial.as_bytes().iter().map(|&x| x - 48).sum();
println!("{}", sum);
As an aside, it's likely more clear if you use b'0' instead of 48.
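The b'0' suggestion could be sketched as follows. The function name digit_sum is hypothetical. Two choices here go beyond the original snippet and are assumptions of this sketch: non-digit bytes are skipped so that the subtraction cannot underflow, and the total is accumulated as u32 so that longer inputs do not overflow a u8.

```rust
/// Hypothetical helper: sums the ASCII digits in `s`, ignoring other bytes.
fn digit_sum(s: &str) -> u32 {
    s.bytes()
        .filter(u8::is_ascii_digit)
        // b'0' is the byte value 48, written so the intent is visible.
        .map(|b| u32::from(b - b'0'))
        .sum()
}
```

The return type gives sum the target type it needs, so no separate annotation is required at the call site.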

If performance is important, consider using an implementation that helps the compiler produce SIMD instructions.
For example, for f32, using 16 lanes (total of 512 bits):
use std::convert::TryInto;
const LANES: usize = 16;
pub fn nonsimd_sum(values: &[f32]) -> f32 {
let chunks = values.chunks_exact(LANES);
let remainder = chunks.remainder();
let sum = chunks.fold([0.0f32; LANES], |mut acc, chunk| {
let chunk: [f32; LANES] = chunk.try_into().unwrap();
for i in 0..LANES {
acc[i] += chunk[i];
}
acc
});
let remainder: f32 = remainder.iter().copied().sum();
let mut reduced = 0.0f32;
for i in 0..LANES {
reduced += sum[i];
}
reduced + remainder
}
pub fn naive_sum(values: &[f32]) -> f32 {
values.iter().sum()
}
For the input
let values = (0..513).map(|x| x as f32).collect::<Vec<_>>();
the above is about 10x faster than values.iter().sum() on my computer:
nonsimd_sum time: [77.341 ns 77.773 ns 78.378 ns]
naive_sum time: [739.97 ns 740.48 ns 740.97 ns]
and ~10% slower than using packed_simd2 (but it does not require nightly).
