Efficient Rust collector of Results holding Vecs

I'm learning Rust, and I've come upon the following pattern which collapses an iterator of Result<Vec<_>, _>s to a single big Vec<_>, failing if any of the results from the iterator failed:
fn accumulate<T, E>(it: impl Iterator<Item = Result<Vec<T>, E>>) -> Result<Vec<T>, E> {
let mut result = Vec::new();
for mut ts in it {
result.append(&mut ts?)
}
Ok(result)
}
I assume a very short "functional-style" version of this function can be written, and I'm struggling to find it. Morally, I'd like to do something like
it.map(|v| v?.into_iter()).flatten().collect()
but this doesn't typecheck. By running small examples, I think the point of the flatten there is to silently drop error results, but I'd instead like to somehow "map the flatten under the Results". I know also that in general you couldn't collect, say, an iterator of type
impl Iterator<Item = Result<impl Iterator<Item = T>, Error>>
into an iterator
Result<impl Iterator<Item = impl Iterator<Item = T>>, Error>
since you need to have actually done all of the computations in the outer iterator in order to know the final result. Nonetheless, it seems that you can make this work in this special case, when you want to .flatten() and then .collect() right after.
Finally, I can see that collect() gives me a way to build a vector of vectors from it, and then I could flatten this vector into the single big vector I want. But this involves many needless memory allocations.
Can the standard library help you do this in an efficient, Rust-ic way?

I think I would start with try_fold, as it can deal with Result and stop on Err:
fn acc2<T, E>(mut it: impl Iterator<Item = Result<Vec<T>, E>>) -> Result<Vec<T>, E> {
it.try_fold(
Vec::new(),
|mut vec, res_ts: Result<Vec<_>, E>| {
res_ts.map(move |mut ts| { // map preserves Err
// In case of Ok(...), append to already found elements
vec.append(&mut ts);
vec
})
}
)
}
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=f6f738ddedecda1875df283f221dbfdc
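For comparison, here is a minimal sketch of the same try_fold idea written with the ? operator inside the closure. The name concat_ok is made up for illustration, and the closure's return type is annotated explicitly so that ? knows what to convert into:

```rust
/// Concatenates the `Ok` chunks, returning the first `Err` encountered.
/// (`concat_ok` is a hypothetical name, not part of any library.)
fn concat_ok<T, E>(mut it: impl Iterator<Item = Result<Vec<T>, E>>) -> Result<Vec<T>, E> {
    it.try_fold(Vec::new(), |mut acc, chunk| -> Result<Vec<T>, E> {
        // `?` returns the Err from the closure, which makes try_fold stop.
        acc.extend(chunk?);
        Ok(acc)
    })
}
```

Called with an iterator over Ok(vec![1, 2]), Ok(vec![]), Ok(vec![3]) it should give Ok(vec![1, 2, 3]); if any item is an Err, that error is returned instead.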
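It turns out that Itertools already has fold_results (deprecated in newer releases in favor of fold_ok, which takes the same arguments), which should do what you want:
use itertools::Itertools;
fn acc3<T, E>(mut it: impl Iterator<Item = Result<Vec<T>, E>>) -> Result<Vec<T>, E> {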
it.fold_results(
Vec::new(),
|mut vec, mut ts| {
vec.append(&mut ts);
vec
}
)
}

To achieve this only using iterator methods:
use std::iter::{self, Iterator};
pub fn accumulate<T, E>(it: impl Iterator<Item = Result<Vec<T>, E>>) -> Result<Vec<T>, E> {
it.flat_map(|v| {
v.map_or_else(
|e| Iter::A(iter::once(Err(e))),
|t| Iter::B(t.into_iter().map(Ok)),
)
})
.collect()
}
// Utility enum that can be generated by the #[auto_enum] derive macro
enum Iter<T, A: Iterator<Item = T>, B: Iterator<Item = T>> {
A(A),
B(B),
}
impl<T, A: Iterator<Item = T>, B: Iterator<Item = T>> Iterator for Iter<T, A, B> {
type Item = T;
fn next(&mut self) -> Option<T> {
match self {
Self::A(a) => a.next(),
Self::B(b) => b.next(),
}
}
}
This uses flat_map to yield, for each entry, either an iterator of Oks or an iterator of a single Err.
This is semantically equivalent to your control-flow code using a for loop.
Playground: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=68558e27900940476e443d670a120e91
See auto_enums for deriving an enum delegating Iterator variants.
Alternatively, you can use either::Either in place of Iter, which has the same implementation for two items:
https://docs.rs/either/1.5.3/either/enum.Either.html#impl-Iterator

Related

Is there a nicer way to implement Display for structs that own collections of things with Display?

I keep finding myself writing Display for structs that hold Vec of some type that implements Display. For example:
use std::fmt::Display;
struct VarTerm {
pub coeffecient: usize,
pub var_name: String,
}
impl Display for VarTerm {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
write!(f, "{}{}", self.coeffecient, self.var_name)
}
}
struct Function {
pub terms: Vec<VarTerm>,
}
impl Display for Function {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
let strings = self
.terms
.iter()
.map(|s| format!("{}", s))
.collect::<Vec<String>>()
.join(" + ");
write!(f, "{}", strings)
}
}
fn main() {
let my_function = Function {
terms: vec![
VarTerm {coeffecient: 2,var_name: "x".to_string(),},
VarTerm {coeffecient: 4,var_name: "y".to_string(),},
VarTerm {coeffecient: 5,var_name: "z".to_string(),},
],
};
println!("All that work to print something: {}", my_function)
}
This looks bulky and ugly to me in a bunch of places - coming from higher-level languages I'm never a fan of the .iter()/.collect() sandwiching (I kind of get why it's needed but it's annoying when 90+ percent of the time I'm just going from Vec to Vec). In this case it's also compounded by the format!() call, which I swear has to be the wrong way to do that.
I'm not sure how much of this is inherent to Rust and how much is me not knowing the right way. I want to get as close as possible to something like:
self.terms.map(toString).join(" + "), which is about what I'd expect in something like Scala.
How close can I get to there? Along the way, is there anything to be done about the aforementioned iter/collect sandwiching in general?
In an eerie coincidence, literally 2 minutes ago I looked at a few methods in the itertools crate. How about this one:
https://docs.rs/itertools/latest/itertools/trait.Itertools.html#method.join
fn join(&mut self, sep: &str) -> String
where
Self::Item: Display
Combine all iterator elements into one String, separated by sep.
Use the Display implementation of each element.
use itertools::Itertools;
assert_eq!(["a", "b", "c"].iter().join(", "), "a, b, c");
assert_eq!([1, 2, 3].iter().join(", "), "1, 2, 3");
EDIT: Additionally, whenever you ask yourself if there was a nicer way to implement a particular trait, especially when the implementation would be somewhat recursive, you should look if there's a derive macro for that trait. Turns out there is, albeit in a separate crate:
https://jeltef.github.io/derive_more/derive_more/display.html
Example:
#[derive(Display)]
#[display(fmt = "({}, {})", x, y)]
struct Point2D {
x: i32,
y: i32,
}
If you find yourself repeating this a lot you might want to move the .into_iter()/.collect() into a specific trait at the cost of generality and composability.
You can also pass ToString::to_string to map.
impl Display for Function {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
let strings = self.terms.map(ToString::to_string).join(" + ");
f.write_str(&strings)
}
}
trait VecMap<'a, T: 'a, U>: IntoIterator<Item = T> {
fn map(self, f: impl FnMut(T) -> U) -> Vec<U>;
}
impl<'a, T: 'a, U> VecMap<'a, &'a T, U> for &'a Vec<T> {
fn map(self, f: impl FnMut(&'a T) -> U) -> Vec<U> {
self.into_iter().map(f).collect()
}
}
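If the intermediate Vec<String> is the main annoyance, another option is to write each term straight into the formatter. This is only a sketch using the standard library; it reuses the struct names from the question (with the field spelled coefficient) and is not taken from any of the answers above:

```rust
use std::fmt::{self, Display};

struct VarTerm {
    coefficient: usize,
    var_name: String,
}

impl Display for VarTerm {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}{}", self.coefficient, self.var_name)
    }
}

struct Function {
    terms: Vec<VarTerm>,
}

impl Display for Function {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        for (i, term) in self.terms.iter().enumerate() {
            // Emit the separator before every term except the first.
            if i > 0 {
                f.write_str(" + ")?;
            }
            // Delegates to VarTerm's Display impl; no temporary String is built.
            write!(f, "{}", term)?;
        }
        Ok(())
    }
}
```

It is a few more lines than the join version, but it avoids both the format! call and the collect.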

How to display value contained in Result or a generic message if Result is Err

I would like to print T if the Result is Ok(T), or a generic message otherwise, and do this as part of the same println!() statement.
My current solution, which works, is:
fn main() {
let x: std::io::Result<i32> = Ok(54);
println!("hello {} ", x.map(|i| i.to_string()).unwrap_or("Bad".to_string()));
}
Is there a simpler and more efficient way that would leverage the Display trait instead of needing to convert to string inside map and unwrap_or?
You can cast each of the Result values to &dyn Display (with std::fmt::Display in scope):
let display = match &x {
Ok(i) => i as &dyn Display,
Err(_) => &"error" as &dyn Display,
};
// or with map
let display = x
.as_ref()
.map(|x| x as &dyn Display)
.unwrap_or_else(|_| &"error" as &dyn Display);
println!(
"hello {} ",
display
)
Both can be in-lined into the println!, I have just separated them for readability.
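For reference, here is the match variant wrapped into a complete helper with its import. The function name hello and its generic signature are assumptions made for the sake of a self-contained sketch:

```rust
use std::fmt::Display;

// Hypothetical helper: formats the Ok value, or a fixed message on Err.
fn hello<T: Display, E>(x: &Result<T, E>) -> String {
    // Both arms are cast to the same trait-object type, so the match
    // has a single type even though the underlying values differ.
    let display = match x {
        Ok(v) => v as &dyn Display,
        Err(_) => &"error" as &dyn Display,
    };
    format!("hello {}", display)
}
```

The only allocation here is the final String produced by format!; with println! in place of format! there would be none.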
While the accepted solution works and answers your question exactly, I don't think it's the most readable (or performant). (Correction: it's in fact faster.) A trait object feels like overkill to me. Consider using the same approach as yours, but instead, merge the map() and unwrap_or() into map_or() (or map_or_else(), to avoid an unnecessary heap allocation).
use std::io;
fn main() {
let x: io::Result<i32> = Ok(54);
println!("Hello, {}.", x.map_or_else(|_err| "Bad".into(), |i| i.to_string()));
}
If you're not going to use the error value, you can make the code a touch more concise by renaming _err to _.
On the other hand, if you're going to be doing this often and in many places, you can create a wrapper type and implement your Display logic for that.
use std::fmt::{self, Display};
use std::io;
fn main() {
let x: io::Result<i32> = Ok(54);
println!("Hello, {}.", PrintableResult(&x));
}
struct PrintableResult<'a, T: Display, E>(&'a Result<T, E>);
impl<'a, T: Display, E> Display for PrintableResult<'a, T, E> {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(
f,
"{}",
self.0
.as_ref()
.map_or_else(|_| "Bad".into(), |val| val.to_string())
)
}
}
If you don't mind PrintableResult taking ownership of your Result value, you can drop the borrow and the lifetime specifier with it.
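A rough sketch of what such an owning wrapper might look like. The name OwnedPrintable is made up here, and unlike the version above, this one forwards to the inner value's Display impl instead of building a String first:

```rust
use std::fmt::{self, Display};

/// Hypothetical owning variant of the wrapper: no borrow, no lifetime.
struct OwnedPrintable<T, E>(Result<T, E>);

impl<T: Display, E> Display for OwnedPrintable<T, E> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match &self.0 {
            // Forward to T's own Display impl, reusing the same formatter.
            Ok(val) => val.fmt(f),
            // The error value is ignored; a fixed message is written instead.
            Err(_) => f.write_str("Bad"),
        }
    }
}
```

With this, println!("Hello, {}.", OwnedPrintable(x)) consumes x, so it only fits if the Result is not needed afterwards.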
If you'd like, you can also implement a to_printable() extension method for Result (via a trait).
// ...
fn main() {
let x: io::Result<i32> = Ok(54);
println!("Hello, {}.", x.to_printable());
let y: Result<i32, &'static str> = Err("Something went wrong.");
println!("Hello, {}.", y.to_printable());
}
// ...
trait ToPrintable<T: Display, E> {
fn to_printable(&self) -> PrintableResult<'_, T, E>;
}
impl<T: Display, E> ToPrintable<T, E> for Result<T, E> {
fn to_printable(&self) -> PrintableResult<'_, T, E> {
// `self` is already a `&Result<T, E>`, so no extra borrow is needed
PrintableResult(self)
}
}

Don't allow user to override value of DerefMut struct, but allow it to execute mutable function on it

I'm currently developing my own library for vectors and matrices, and to simplify my life, I defined my Matrix to be a Vec of RowVector, and implemented the Deref trait as follows:
pub struct Matrix(Vec<RowVector>);
impl Deref for Matrix {
type Target = [RowVector];
fn deref(&self) -> &Self::Target {
&self.0
}
}
impl DerefMut for Matrix {
fn deref_mut(&mut self) -> &mut Self::Target {
&mut self.0
}
}
This works like a charm, but it has one flaw: you can overwrite a row with a RowVector of a different size than the rest, which is obviously VERY BAD.
Am I doomed? Is there a way to disallow overwriting a row while still allowing the RowVector to be mutated?
You could implement Index and IndexMut over a pair (usize, usize):
use std::ops::{IndexMut, Index};
pub struct Matrix(Vec<Vec<usize>>);
impl Index<(usize, usize)> for Matrix {
type Output = usize;
fn index(&self, index: (usize, usize)) -> &Self::Output {
self.0.get(index.0).unwrap().get(index.1).unwrap()
}
}
impl IndexMut<(usize, usize)> for Matrix {
fn index_mut(&mut self, index: (usize, usize)) -> &mut Self::Output {
self.0.get_mut(index.0).unwrap().get_mut(index.1).unwrap()
}
}
Disclaimer: Please take into account that using unwrap is not clean here. Either assert lengths, deal with options or at least use expect depending on your needs.
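Following that disclaimer, here is one possible sketch that asserts the bounds explicitly instead of unwrapping. It assumes f64 elements and flat row-major storage, and the zeros constructor and offset helper are hypothetical names. Because only single elements are reachable through the index, a row cannot be replaced by one of a different length:

```rust
use std::ops::{Index, IndexMut};

pub struct Matrix {
    rows: usize,
    cols: usize,
    data: Vec<f64>, // row-major storage, always rows * cols long
}

impl Matrix {
    // Hypothetical constructor: a rows x cols matrix filled with zeros.
    pub fn zeros(rows: usize, cols: usize) -> Self {
        Matrix { rows, cols, data: vec![0.0; rows * cols] }
    }

    // Maps (row, col) to a position in `data`, panicking with a clear
    // message when the index is outside the matrix.
    fn offset(&self, (r, c): (usize, usize)) -> usize {
        assert!(
            r < self.rows && c < self.cols,
            "index ({}, {}) out of bounds for {}x{} matrix",
            r, c, self.rows, self.cols
        );
        r * self.cols + c
    }
}

impl Index<(usize, usize)> for Matrix {
    type Output = f64;
    fn index(&self, index: (usize, usize)) -> &f64 {
        let i = self.offset(index);
        &self.data[i]
    }
}

impl IndexMut<(usize, usize)> for Matrix {
    fn index_mut(&mut self, index: (usize, usize)) -> &mut f64 {
        let i = self.offset(index);
        &mut self.data[i]
    }
}
```

Usage then looks like m[(1, 2)] = 7.5; for a matrix created with Matrix::zeros(2, 3).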

IntoIterator as a function argument doesn't accept adapter struct

I want a function that accepts &IntoIterator<Item=u32>, so that I can pass it both &Vec<u32> and iterator adapter structs (like Map, Filter and others, which I believe all implement IntoIterator).
So I have a function like
pub fn f<'a, T>(it_src: &'a T) -> u32
where &'a T: IntoIterator<Item = u32> {
let it = it_src.into_iter();
let result: u32;
// more more usage
result
}
And this is how I tried to use it (same signature, but different name)
pub fn f_with_feature<'a, T>(it_src: &'a T) -> u32
where &'a T: IntoIterator<Item = u32> {
let adjusted_values = it_src.into_iter()
.map(|e| adjust(e));
f(&adjusted_values)
}
What I've got is an error
error[E0308]: mismatched types
--> src\main.rs:14:7
|
14 | f(&adjusted_values)
| ^^^^^^^^^^^^^^^^ expected type parameter, found struct `std::iter::Map`
|
= note: expected type `&T`
found type `&std::iter::Map<<&T as std::iter::IntoIterator>::IntoIter, [closure#src\main.rs:13:14: 13:27]>`
How is it that Map doesn't match as T?
Also, it occurred to me that passing iterator adapters with static dispatch might not be a good idea, since each distinct closure used to build a Map will create a new function specialization. Yet I've seen that static dispatch is the idiomatic approach in Rust most of the time. How should I manage this situation?
I think you want to have trait bounds on T (and not on &'a T). So I guess you actually want the following:
pub fn f<'a, T>(it_src: &'a T) -> u32
where T: IntoIterator<Item = u32> {
let it = it_src.into_iter();
let result: u32 = 1;
// more more usage
result
}
pub fn f_with_feature<'a, T>(it_src: &'a T) -> u32
where T: IntoIterator<Item = u32> {
let adjusted_values = it_src.into_iter()
.map(|e| adjust(e));
f(&adjusted_values)
}
Which brings us to the next problem: IntoIterator's into_iter consumes self, which means that you cannot call it_src.into_iter if you only borrow it_src.
So if you really want to use into_iter, you can try this:
pub fn f<T>(it_src: T) -> u32
where T: IntoIterator<Item = u32> {
let it = it_src.into_iter();
let result: u32 = 1;
// more more usage
result
}
pub fn f_with_feature<T>(it_src: T) -> u32
where T: IntoIterator<Item = u32> {
let adjusted_values = it_src.into_iter()
.map(|e| adjust(e));
f(adjusted_values)
}
The above, however, requires you to move the values into f or f_with_feature, respectively.
In my experience, just taking an iterator (and doing the conversion at call site if necessary), leads to simple, straightforward solutions:
pub fn f<T>(it_src: T) -> u32
where T: Iterator<Item = u32> {
let it = it_src.into_iter();
let result: u32 = 1;
// more more usage
result
}
pub fn f_with_feature<T>(it_src: T) -> u32
where T: Iterator<Item = u32> {
let adjusted_values = it_src.into_iter()
.map(|e| adjust(e));
f(adjusted_values)
}
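To make the call sites concrete, here is a small sketch. The body of adjust and the summing in f are placeholders invented for illustration, since the question does not show them. It also includes a dyn variant as one possible answer to the worry about a new specialization per closure:

```rust
// Hypothetical stand-in for the question's `adjust`.
fn adjust(e: u32) -> u32 {
    e + 1
}

// Static dispatch: accepts a Vec<u32>, a Map, a Filter, and so on, by value.
pub fn f<I>(src: I) -> u32
where
    I: IntoIterator<Item = u32>,
{
    // Placeholder body: just sums the items.
    src.into_iter().sum()
}

pub fn f_with_feature<I>(src: I) -> u32
where
    I: IntoIterator<Item = u32>,
{
    f(src.into_iter().map(adjust))
}

// Dynamic dispatch: compiled once, whatever adapter chain is passed in.
pub fn f_dyn(src: &mut dyn Iterator<Item = u32>) -> u32 {
    let mut total = 0;
    for e in src {
        total += e;
    }
    total
}
```

A borrowed Vec<u32> can be passed as v.iter().copied(), which leaves v usable afterwards, while f(v) moves it. For the dyn version the caller passes something like &mut v.iter().copied().map(adjust).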

Index and IndexMut implementations to return borrowed vectors

I've been working on a multi-dimensional array library, toying around with different interfaces, and ran into an issue I can't seem to solve. This may be a simple misunderstanding of lifetimes, but I've tried just about every solution I can think of, to no success.
The goal: implement the Index and IndexMut traits to return a borrowed vector from a 2d matrix, so this syntax can be used mat[rowind][colind].
A (very simplified) version of the data structure definition is below.
pub struct Matrix<T> {
shape: [uint, ..2],
dat: Vec<T>
}
impl<T: FromPrimitive+Clone> Matrix<T> {
pub fn new(shape: [uint, ..2]) -> Matrix<T> {
let size = shape.iter().fold(1, |a, &b| { a * b});
// println!("Creating MD array of size: {} and shape: {}", size, shape)
Matrix{
shape: shape,
dat: Vec::<T>::from_elem(size, FromPrimitive::from_uint(0u).expect("0 must be convertible to parameter type"))
}
}
pub fn mut_index(&mut self, index: uint) -> &mut [T] {
let base = index*self.shape[1];
self.dat.mut_slice(base, base + self.shape[1])
}
}
fn main(){
let mut m = Matrix::<f32>::new([4u,4]);
println!("{}", m.dat)
println!("{}", m.mut_index(3)[0])
}
The mut_index method works exactly as I would like the IndexMut trait to work, except of course that it doesn't have the syntax sugar. The first attempt at implementing IndexMut made me wonder, since it returns a borrowed reference to the specified type, I really want to specify [T] as a type, but it isn't a valid type. So the only option is to specify &mut [T] like this.
impl<T: FromPrimitive+Clone> IndexMut<uint, &mut [T]> for Matrix<T> {
fn index_mut(&mut self, index: &uint) -> &mut(&mut[T]) {
let base = index*self.shape[1];
&mut self.dat.mut_slice(base, base + self.shape[1])
}
}
This complains about a missing lifetime specifier on the trait impl line. So I try adding one.
impl<'a, T: FromPrimitive+Clone> IndexMut<uint, &'a mut [T]> for Matrix<T> {
fn index_mut(&'a mut self, index: &uint) -> &mut(&'a mut[T]) {
let base = index*self.shape[1];
&mut self.dat.mut_slice(base, base + self.shape[1])
}
}
Now I get method `index_mut` has an incompatible type for trait: expected concrete lifetime, but found bound lifetime parameter 'a [E0053]. Aside from this I've tried just about every combination of one and two lifetimes I can think of, as well as creating a secondary structure to hold a reference that is stored in the outer structure during the indexing operation so a reference to that can be returned instead, but that's not possible for Index. The final answer may just be that this isn't possible, given the response on this old github issue, but that would seem to be a problematic limitation of the Index and IndexMut traits. Is there something I'm missing?
At present, this is not possible, but when Dynamically Sized Types lands I believe it will become possible.
Let’s look at the signature:
pub trait IndexMut<Index, Result> {
fn index_mut<'a>(&'a mut self, index: &Index) -> &'a mut Result;
}
(Note the addition of the <'a> compared with what the docs say; I’ve filed #16228 about that.)
'a is an arbitrary lifetime, but it is important that it is specified on the method, not on the impl as a whole: it is in absolute truth a generic parameter to the method. I’ll show how it all comes out here with the names 'ρ₀ and 'ρ₁. So then, in this attempt:
impl<'ρ₀, T: FromPrimitive + Clone> IndexMut<uint, &'ρ₀ mut [T]> for Matrix<T> {
fn index_mut<'ρ₁>(&'ρ₁ mut self, index: &uint) -> &'ρ₁ mut &'ρ₀ mut [T] {
let base = index * self.shape[1];
&mut self.dat.mut_slice(base, base + self.shape[1])
}
}
This satisfies the requirements that (a) all lifetimes must be explicit in the impl header, and (b) that the method signature matches the trait definition: Index is uint and Result is &'ρ₀ mut [T]. Because 'ρ₀ is defined on the impl block (so that it can be used as a parameter there) and 'ρ₁ on the method (because that’s what the trait defines), 'ρ₀ and 'ρ₁ cannot be combined into a single named lifetime. (You could call them both 'a, but this is shadowing and does not change anything except for the introduction of a bit more confusion!)
However, this is not enough to make it all work, and it will indeed not compile, because 'ρ₀ is not tied to anything, nor is there anything to tie it to in the signature. And so you cannot cast self.dat.mut_slice(…), which is of type &'ρ₁ mut [T], to &'ρ₀ mut [T], as the lifetimes do not match, nor is there any known subtyping relationship between them (that is, it cannot structurally be demonstrated that the lifetime 'ρ₀ is less than, i.e. a subtype of, 'ρ₁; although the return type of the method would make that clear, it is not so at the basic type level, and so it is not permitted).
Now as it happens, IndexMut isn’t as useful as it should be anyway owing to #12825, as matrix[1] would always use IndexMut and never Index if you have implemented both. I’m not sure if that’s any consolation, though!
The solution comes in Dynamically Sized Types. When that is here, [T] will be a legitimate unsized type which can be used as the type for Result and so this will be the way to write it:
impl<T: FromPrimitive + Clone> IndexMut<uint, [T]> for Matrix<T> {
fn index_mut<'a>(&'a mut self, index: &uint) -> &'a mut [T] {
let base = index * self.shape[1];
&mut self.dat.mut_slice(base, base + self.shape[1])
}
}
… but that’s not here yet.
This code works in Rust 1.25.0 (and probably has for quite a while):
extern crate num;
use num::Zero;
pub struct Matrix<T> {
shape: [usize; 2],
dat: Vec<T>,
}
impl<T: Zero + Clone> Matrix<T> {
pub fn new(shape: [usize; 2]) -> Matrix<T> {
let size = shape.iter().product();
Matrix {
shape: shape,
dat: vec![T::zero(); size],
}
}
pub fn mut_index(&mut self, index: usize) -> &mut [T] {
let base = index * self.shape[1];
&mut self.dat[base..][..self.shape[1]]
}
}
fn main() {
let mut m = Matrix::<f32>::new([4; 2]);
println!("{:?}", m.dat);
println!("{}", m.mut_index(3)[0]);
}
You can enhance it to support Index and IndexMut:
use std::ops::{Index, IndexMut};
impl<T> Index<usize> for Matrix<T> {
type Output = [T];
fn index(&self, index: usize) -> &[T] {
let base = index * self.shape[1];
&self.dat[base..][..self.shape[1]]
}
}
impl<T> IndexMut<usize> for Matrix<T> {
fn index_mut(&mut self, index: usize) -> &mut [T] {
let base = index * self.shape[1];
&mut self.dat[base..][..self.shape[1]]
}
}
fn main() {
let mut m = Matrix::<f32>::new([4; 2]);
println!("{:?}", m.dat);
println!("{}", m[3][0]);
m[3][0] = 42.42;
println!("{:?}", m.dat);
println!("{}", m[3][0]);
}
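If pulling in the num crate is not desired, a variant using only the standard library might look like the following sketch. It assumes T: Default is an acceptable substitute for Zero (the default of the numeric primitives is zero):

```rust
use std::ops::{Index, IndexMut};

pub struct Matrix<T> {
    shape: [usize; 2],
    dat: Vec<T>,
}

impl<T: Default + Clone> Matrix<T> {
    pub fn new(shape: [usize; 2]) -> Self {
        // Assumption: T::default() plays the role of zero.
        Matrix { shape, dat: vec![T::default(); shape[0] * shape[1]] }
    }
}

impl<T> Index<usize> for Matrix<T> {
    type Output = [T];
    fn index(&self, row: usize) -> &[T] {
        // Rows are stored back to back, each shape[1] elements long.
        let base = row * self.shape[1];
        &self.dat[base..base + self.shape[1]]
    }
}

impl<T> IndexMut<usize> for Matrix<T> {
    fn index_mut(&mut self, row: usize) -> &mut [T] {
        let base = row * self.shape[1];
        &mut self.dat[base..base + self.shape[1]]
    }
}
```

Since the Output type is the slice [T], m[3][0] = 42.5 works, while the length of a row cannot be changed through the index.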
