Chain Vector and IntoIterator element in Rust - rust

I want to write a function in Rust that will return the vector composed of start integer, then all intermediate integers and then end integer. The assertion it should hold is this:
assert_eq!(intervals(0, 4, 1..4), vec![0, 1, 2, 3, 4]);
The hint is to use chain method for iterators. The function declaration is predefined, I implemented it in one way, which is the following code:
pub fn intervals< I>(start: u32, end: u32, intermediate: I) -> Vec<u32>
where
I: IntoIterator<Item = u32>,
{
let mut a1 = vec![];
a1.push(start);
let inter: Vec<u32> = intermediate.into_iter().collect();
let mut iter : Vec<u32> = a1.iter().chain(inter.iter()).map(|x| *x).collect();
iter.push(end);
return iter;
}
But I am quite convinced this is not really optimal way to do this. I am sure I am doing lots of unnecessary things in the middle two lines. I tried to use intermediate directly like this:
let mut iter: Vec<u32> = a1.iter().chain(intermediate).map(|x| *x).collect();
But I am getting this error for chain method and I don't know how to solve it:
type mismatch resolving <I as std::iter::IntoIterator>::Item==&u32,
expected u32, found &u32
I am super new in Rust so any advice would be helpful to understand what's the right way to use intermediate parameter here.

Here are a few hints:
You have created three separate vectors (one explicitly, two using collect) when in fact you only need one.
You can use the std::iter::once iterator to produce iterators for the start and end integers
No need to collect the intermediate range. The intermediate argument implements IntoIterator, so you can feed it directly to chain. So, you can chain together the start, intermediate and end.
No need to use the 'return' keyword at the end of a function - the result of a function is the value of the last expression in it (as long as there is no semicolon on the end).
Applying those tips your function would look like this:
use std::iter::once;
pub fn intervals< I>(start: u32, end: u32, intermediate: I) -> Vec<u32>
where
I: IntoIterator<Item = u32>,
{
once(start).chain(intermediate).chain(once(end)).collect()
}
One additional thing to note, to answer your question from the comments:
why trying this: a1.iter().chain(intermediate) gives an error with chain method
Calling Vec::iter() returns an iterator that returns references to the values in the vector. This makes sense: calling iter() does not consume the vector, and its contents remain intact: you could iterate over it multiple times if you wanted.
On the other hand, invoking into_iter() from the IntoIterator trait returns an iterator that returns the values. This also makes sense: into_iter() does consume the object you are calling it on, so the iterator then takes ownership of the items that were previously owned by the object.
Trying to chain together two such iterators does not work because they are each iterating different types. One resolution would be to consume a1 as well, like this:
let mut iter : Vec<u32> = a1.into_iter().chain(intermediate).collect();

Related

the trait SliceIndex<[i32]> is not implemented for [RangeFrom<{integer}>; 1]

I have written a function which takes a generic parameter T with bound AsRef[i32].
Now I want to slice the input further inside my function with get method. But rust compiler would not let me use 1.. range to slice. I can use split_at method to split the slice. That will work. But my question is why can't I use array.as_ref().get([1..]) in this case? Do I need to add any other trait bounds to the generic type to make it work? If I do get with one index like array.as_ref().get(0) that works fine.
Here is my code -
fn find<T>(array: T, key: i32) -> Option<usize>
where
T: AsRef<[i32]>,
{
let arr = array.as_ref().get([1..]);
println!("slicing successful");
None
}
fn main() {
let arr = [1, 2, 3];
find(arr, 1);
}
Playground link.
You are confusing two syntax. The first one is the most commonly used to index a slice:
let arr = array.as_ref()[1..];
This is just syntax sugar for
let arr = array.as_ref().index(1..);
Note that for the second version to work, you need to have the std::ops::Index trait in scope.
This will not work as is because it returns a slice [i32], and [i32]: !Sized. Therefore you need to add a level of indirection:
let arr = &array.as_ref()[1..];
See the playground.
The second possible way is to use the get method of slices:
let arr = array.as_ref().get(1..);
See the playground.

Proper signature for a function accepting an iterator of strings

I'm confused about the proper type to use for an iterator yielding string slices.
fn print_strings<'a>(seq: impl IntoIterator<Item = &'a str>) {
for s in seq {
println!("- {}", s);
}
}
fn main() {
let arr: [&str; 3] = ["a", "b", "c"];
let vec: Vec<&str> = vec!["a", "b", "c"];
let it: std::str::Split<'_, char> = "a b c".split(' ');
print_strings(&arr);
print_strings(&vec);
print_strings(it);
}
Using <Item = &'a str>, the arr and vec calls don't compile. If, instead, I use <Item = &'a'a str>, they work, but the it call doesn't compile.
Of course, I can make the Item type generic too, and do
fn print_strings<'a, I: std::fmt::Display>(seq: impl IntoIterator<Item = I>)
but it's getting silly. Surely there must be a single canonical "iterator of string values" type?
The error you are seeing is expected because seq is &Vec<&str> and &Vec<T> implements IntoIterator with Item=&T, so with your code, you end up with Item=&&str where you are expecting it to be Item=&str in all cases.
The correct way to do this is to expand Item type so that is can handle both &str and &&str. You can do this by using more generics, e.g.
fn print_strings(seq: impl IntoIterator<Item = impl AsRef<str>>) {
for s in seq {
let s = s.as_ref();
println!("- {}", s);
}
}
This requires the Item to be something that you can retrieve a &str from, and then in your loop .as_ref() will return the &str you are looking for.
This also has the added bonus that your code will also work with Vec<String> and any other type that implements AsRef<str>.
TL;DR The signature you use is fine, it's the callers that are providing iterators with wrong Item - but can be easily fixed.
As explained in the other answer, print_string() doesn't accept &arr and &vec because IntoIterator for &[T; n] and &Vec<T> yield references to T. This is because &Vec, itself a reference, is not allowed to consume the Vec in order to move T values out of it. What it can do is hand out references to T items sitting inside the Vec, i.e. items of type &T. In the case of your callers that don't compile, the containers contain &str, so their iterators hand out &&str.
Other than making print_string() more generic, another way to fix the issue is to call it correctly to begin with. For example, these all compile:
print_strings(arr.iter().map(|sref| *sref));
print_strings(vec.iter().copied());
print_strings(it);
Playground
iter() is the method provided by slices (and therefore available on arrays and Vec) that iterates over references to elements, just like IntoIterator of &Vec. We call it explicitly to be able to call map() to convert &&str to &str the obvious way - by using the * operator to dereference the &&str. The copied() iterator adapter is another way of expressing the same, possibly a bit less cryptic than map(|x| *x). (There is also cloned(), equivalent to map(|x| x.clone()).)
It's also possible to call print_strings() if you have a container with String values:
let v = vec!["foo".to_owned(), "bar".to_owned()];
print_strings(v.iter().map(|s| s.as_str()));

How to properly pass Iterators to a function in Rust

I want to pass Iterators to a function, which then computes some value from these iterators.
I am not sure how a robust signature to such a function would look like.
Lets say I want to iterate f64.
You can find the code in the playground: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=c614429c541f337adb102c14518cf39e
My first attempt was
fn dot(a : impl std::iter::Iterator<Item = f64>,b : impl std::iter::Iterator<Item = f64>) -> f64 {
a.zip(b).map(|(x,y)| x*y).sum()
}
This fails to compile if we try to iterate over slices
So you can do
fn dot<'a>(a : impl std::iter::Iterator<Item = &'a f64>,b : impl std::iter::Iterator<Item = &'a f64>) -> f64 {
a.zip(b).map(|(x,y)| x*y).sum()
}
This fails to compile if I try to iterate over mapped Ranges.
(Why does the compiler requires the livetime parameters here?)
So I tried to accept references and not references generically:
pub fn dot<T : Borrow<f64>, U : Borrow<f64>>(a : impl std::iter::Iterator::<Item = T>, b: impl std::iter::Iterator::<Item = U>) -> f64 {
a.zip(b).map(|(x,y)| x.borrow()*y.borrow()).sum()
}
This works with all combinations I tried, but it is quite verbose and I don't really understand every aspect of it.
Are there more cases?
What would be the best practice of solving this problem?
There is no right way to write a function that can accept Iterators, but there are some general principles that we can apply to make your function general and easy to use.
Write functions that accept impl IntoIterator<...>. Because all Iterators implement IntoIterator, this is strictly more general than a function that accepts only impl Iterator<...>.
Borrow<T> is the right way to abstract over T and &T.
When trait bounds get verbose, it's often easier to read if you write them in where clauses instead of in-line.
With those in mind, here's how I would probably write dot:
fn dot<I, J>(a: I, b: J) -> f64
where
I: IntoIterator,
J: IntoIterator,
I::Item: Borrow<f64>,
J::Item: Borrow<f64>,
{
a.into_iter()
.zip(b)
.map(|(x, y)| x.borrow() * y.borrow())
.sum()
}
However, I also agree with TobiP64's answer in that this level of generality may not be necessary in every case. This dot is nice because it can accept a wide range of arguments, so you can call dot(&some_vec, some_iterator) and it just works. It's optimized for readability at the call site. On the other hand, if you find the Borrow trait complicates the definition too much, there's nothing wrong with optimizing for readability at the definition, and forcing the caller to add a .iter().copied() sometimes. The only thing I would definitely change about the first dot function is to replace Iterator with IntoIterator.
You can iterate over slices with the first dot implementation like that:
dot([0, 1, 2].iter().cloned(), [0, 1, 2].iter().cloned());
(https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.cloned)
or
dot([0, 1, 2].iter().copied(), [0, 1, 2].iter().copied());
(https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.copied)
Why does the compiler requires the livetime parameters here?
As far as I know every reference in rust has a lifetime, but the compiler can infer simple it in cases. In this case, however the compiler is not yet smart enough, so you need to tell it how long the references yielded by the iterator lives.
Are there more cases?
You can always use iterator methods, like the solution above, to get an iterator over f64, so you don't have to deal with lifetimes or generics.
What would be the best practice of solving this problem?
I would recommend the first version (and thus leaving it to the caller to transform the iterator to Iterator<f64>), simply because it's the most readable.

Is there a way to remove entries from a generic first vector that are present in another vector?

I have a problem understanding ownership when a higher order function is called. I am supposed to remove entries from the first vector if the elements exist in the second vector so I came up with this attempt:
fn array_diff<T: PartialEq>(a: Vec<T>, b: Vec<T>) -> Vec<T> {
a.iter()
.filter(|incoming| !b.contains(incoming))
.collect::<Vec<T>>()
}
I can't change the function signature. The .collect() call doesn't work because all I am getting is a reference to elements in a. While this is generic, I don't know if the result is copy-able or clone-able. I also probably can't dereference the elements in a.
Is there a way to fix this piece of code without rewriting it from scratch?
For this particular test ... you can consume the vector instead of relying on references. The signature yields values and not references. As such, to pass the test you only have to use into_iter instead:
a.into_iter() // <----------- call into_iter
.filter(|incoming| !b.contains(incoming))
.collect::<Vec<T>>()
This consumes the values and returns them out again.
Destroying the incoming allocation to create a new allocation isn't very efficient. Instead, write code that is more directly in line with the problem statement:
fn array_diff<T: PartialEq>(mut a: Vec<T>, b: Vec<T>) -> Vec<T> {
a.retain(|aa| !b.contains(aa));
a
}
Adding mut in the signature doesn't change the signature because no one can tell that you've added it. It's the exact same as:
fn array_diff<T: PartialEq>(a: Vec<T>, b: Vec<T>) -> Vec<T> {
let mut a = a;
a.retain(|aa| !b.contains(aa));
a
}

Mutable multidimensional array as a function argument

In rustc 1.0.0, I'd like to write a function that mutates a two dimensional array supplied by the caller. I was hoping this would work:
fn foo(x: &mut [[u8]]) {
x[0][0] = 42;
}
fn main() {
let mut x: [[u8; 3]; 3] = [[0; 3]; 3];
foo(&mut x);
}
It fails to compile:
$ rustc fail2d.rs
fail2d.rs:7:9: 7:15 error: mismatched types:
expected `&mut [[u8]]`,
found `&mut [[u8; 3]; 3]`
(expected slice,
found array of 3 elements) [E0308]
fail2d.rs:7 foo(&mut x);
^~~~~~
error: aborting due to previous error
I believe this is telling me I need to somehow feed the function a slice of slices, but I don't know how to construct this.
It "works" if I hard-code the nested array's length in the function signature. This isn't acceptable because I want the function to operate on multidimensional arrays of arbitrary dimension:
fn foo(x: &mut [[u8; 3]]) { // FIXME: don't want to hard code length of nested array
x[0][0] = 42;
}
fn main() {
let mut x: [[u8; 3]; 3] = [[0; 3]; 3];
foo(&mut x);
}
tldr; any zero-cost ways of passing a reference to a multidimensional array such that the function use statements like $x[1][2] = 3;$?
This comes down to a matter of memory layout. Assuming a type T with a size known at compile time (this constraint can be written T: Sized), the size of [T; n] is known at compile time (it takes n times as much memory as T does); but [T] is an unsized type; its length is not known at compile time. Therefore it can only be used through some form of indirection, such as a reference (&[T]) or a box (Box<[T]>, though this is of limited practical value, with Vec<T> which allows you to add and remove items without needing to reallocate every single time by using overallocation).
A slice of an unsized type doesn’t make sense; it’s permitted for reasons that are not clear to me, but you can never actually have an instance of it. (Vec<T>, by comparison, requires T: Sized.)
&[T; n] can coerce to &[T], and &mut [T; n] to &mut [T], but this only applies at the outermost level; the contents of slice is fixed (you’d need to create a new array or vector to achieve such a transformation, because the memory layout of each item is different). The effect of this is that arrays work for single‐dimensional work, but for multi‐dimensional work they fall apart. Arrays are currently very much second‐class citizens in Rust, and will be until the language supports making slices generic over length, which it is likely to eventually.
I recommend that you use either a single‐dimensional array (suitable for square matrices, indexed by x * width + y or similar), or vectors (Vec<Vec<T>>). There may also be libraries already out there abstracting over a suitable solution.

Resources