Convert iterator of Result to Result<Vec<_>> [duplicate]

I have a function that returns a Result:
fn find(id: &Id) -> Result<Item, ItemError> {
    // ...
}
Then another function uses it like this:
let parent_items: Vec<Item> = parent_ids.iter()
    .map(|id| find(id).unwrap())
    .collect();
How do I handle a failure inside any of the map iterations?
I know I could use flat_map, in which case the error results would be ignored:
let parent_items: Vec<Item> = parent_ids.iter()
    .flat_map(|id| find(id).into_iter())
    .collect();
Result's iterator yields either 0 or 1 items depending on the success state, and flat_map drops the result when it yields 0.
However, I don't want to ignore errors. Instead, I want the whole code block to stop and return a new error (based on the error that came up within the map, or just forwarding the existing error).
How do I best handle this in Rust?

Result implements FromIterator, so you can move the Result outside and iterators will take care of the rest (including stopping iteration if an error is found).
#[derive(Debug)]
struct Item;
type Id = String;

fn find(id: &Id) -> Result<Item, String> {
    Err(format!("Not found: {:?}", id))
}

fn main() {
    let s = |s: &str| s.to_string();
    let ids = vec![s("1"), s("2"), s("3")];
    let items: Result<Vec<_>, _> = ids.iter().map(find).collect();
    println!("Result: {:?}", items);
}
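The question also asks about returning a new error based on the one raised inside the map. A minimal sketch of that, assuming a hypothetical LoadError type, u32 ids, and a toy find that fails for ids of 10 or more (none of these come from the original post), could combine map_err with collecting into a Result:

```rust
#[derive(Debug, PartialEq)]
struct Item(u32);

#[derive(Debug, PartialEq)]
struct ItemError(u32);

// Hypothetical caller-level error, introduced only for this illustration.
#[derive(Debug, PartialEq)]
enum LoadError {
    MissingParent(u32),
}

// Toy lookup: ids below 10 exist, everything else fails.
fn find(id: &u32) -> Result<Item, ItemError> {
    if *id < 10 {
        Ok(Item(*id))
    } else {
        Err(ItemError(*id))
    }
}

fn load_parents(parent_ids: &[u32]) -> Result<Vec<Item>, LoadError> {
    let items = parent_ids
        .iter()
        // Convert each low-level error into the caller's error type.
        .map(|id| find(id).map_err(|ItemError(id)| LoadError::MissingParent(id)))
        // Collecting into Result stops at the first Err.
        .collect::<Result<Vec<_>, _>>()?;
    Ok(items)
}
```

Calling load_parents(&[1, 42, 3, 99]) would report only the first failing id, because collect stops pulling items once it sees an Err.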

The accepted answer shows how to stop on error while collecting, and that's fine because that's what the OP requested. If you need processing that also works on large or infinite fallible iterators, read on.
As already noted, a for loop can be used to emulate stop-on-error, but that is sometimes inelegant, as when you want to call max() or another method that consumes the iterator. In other situations it's next to impossible, as when the iterator is consumed by code in another crate, such as itertools or Rayon (see footnote 1).
Iterator consumer: try_for_each
When you control how the iterator is consumed, you can use try_for_each to stop on the first error. It accepts a closure that returns a Result; try_for_each() returns Ok(()) if the closure returned Ok every time, and otherwise stops at the first Err and returns it. This allows the closure to propagate errors simply by using the ? operator in the natural way:
use std::{fs, io};

fn main() -> io::Result<()> {
    fs::read_dir("/")?.try_for_each(|e| -> io::Result<()> {
        println!("{}", e?.path().display());
        Ok(())
    })?;
    // ...
    Ok(())
}
If you need to maintain state between the invocations of the closure, you can also use try_fold. Both methods are implemented by ParallelIterator, so the same pattern works with Rayon.
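As a rough illustration of the try_fold variant, here is a sketch that carries a running total between invocations and stops at the first failure. The function name checked_total and its String error type are assumptions made for this example only:

```rust
// Sums decimal strings, stopping at the first parse failure or overflow.
fn checked_total(values: &[&str]) -> Result<i64, String> {
    values
        .iter()
        .try_fold(0i64, |acc, s| -> Result<i64, String> {
            // `?` returns the Err from the closure, which ends the fold early.
            let n = s
                .parse::<i64>()
                .map_err(|e| format!("bad number {s:?}: {e}"))?;
            acc.checked_add(n)
                .ok_or_else(|| format!("overflow while adding {n}"))
        })
}
```

The accumulator plays the role of the state, so no mutable variable outside the closure is needed.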
try_for_each() does require that you control how the iterator is consumed. If that is done by code not under your control (for example, if you are passing the iterator to itertools::merge() or similar), you will need an adapter.
Iterator adapter: scan
The first attempt at stopping on error is to use take_while:
use std::{io, fs};

fn main() -> io::Result<()> {
    fs::read_dir("/")?
        .take_while(Result::is_ok)
        .map(Result::unwrap)
        .for_each(|e| println!("{}", e.path().display()));
    // ...
    Ok(())
}
This works, but we get no indication that an error occurred; the iteration just silently stops. It also requires the unsightly map(Result::unwrap), which makes it look as if the program will panic on error. In fact it won't, because take_while has already stopped the iteration at the first Err.
Both issues can be addressed by switching from take_while to scan, a more powerful combinator that not only supports stopping the iteration, but also passes owned items to its callback, allowing the closure to move the error out to the caller:
use std::{fs, io};

fn main() -> io::Result<()> {
    let mut err = Ok(());
    fs::read_dir("/")?
        .scan(&mut err, |err, res| match res {
            Ok(o) => Some(o),
            Err(e) => {
                // Store the error for the caller and end the iteration.
                **err = Err(e);
                None
            }
        })
        .for_each(|e| println!("{}", e.path().display()));
    err?;
    // ...
    Ok(())
}
If needed in multiple places, the closure can be abstracted into a utility function:
fn until_err<T, E>(err: &mut &mut Result<(), E>, item: Result<T, E>) -> Option<T> {
    match item {
        Ok(item) => Some(item),
        Err(e) => {
            **err = Err(e);
            None
        }
    }
}
...in which case we can invoke it as .scan(&mut err, until_err).
These examples trivially exhaust the iterator with for_each(), but one can chain it with arbitrary manipulations, including Rayon's par_bridge(). Using scan() it is even possible to collect() the items into a container and have access to the items seen before the error, which is sometimes useful and unavailable when collecting into Result<Container, Error>.
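To make the last point concrete, here is a small sketch that collects the items seen before the error and still reports the error afterwards. It reuses until_err as defined above; parse_prefix and its string inputs are hypothetical names chosen for illustration:

```rust
use std::num::ParseIntError;

// Same helper as above: stash the first error and end the iteration.
fn until_err<T, E>(err: &mut &mut Result<(), E>, item: Result<T, E>) -> Option<T> {
    match item {
        Ok(item) => Some(item),
        Err(e) => {
            **err = Err(e);
            None
        }
    }
}

// Returns every value parsed before the first failure, plus the failure itself.
fn parse_prefix(inputs: &[&str]) -> (Vec<i32>, Result<(), ParseIntError>) {
    let mut err = Ok(());
    let parsed: Vec<i32> = inputs
        .iter()
        .map(|s| s.parse::<i32>())
        .scan(&mut err, until_err)
        .collect();
    // The mutable borrow of `err` ended with the iterator, so it can be returned.
    (parsed, err)
}
```

With collect into Result<Vec<_>, _>, the partially built vector would be dropped on error; here the caller gets both pieces.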
Footnote 1: Needing to use par_bridge() comes up when using Rayon to process streaming data in parallel:
fn process(input: impl BufRead + Send) -> Result<Output, Error> {
    let mut err = Ok(());
    let output = input
        .lines()
        .scan(&mut err, until_err)
        .par_bridge()
        .map(|line| ... executed in parallel ... )
        .reduce(|item| ... also executed in parallel ...);
    err?;
    ...
    Ok(output)
}
Again, an equivalent effect cannot be trivially achieved by collecting into Result.

Handling Results from nested .map() closures
What if we have a .map() within a .map() within a .map()?
Here's an example for the specific case where the .map() operations are nested. The problem it solves is how to propagate a failure from the innermost closure while avoiding .unwrap(), which panics on failure.
This approach also enables using the ? syntax at the outer layer to return the error if one occurs, or to unwrap the result and assign it to a variable if no error occurred. Inside a closure, ? only returns from the closure itself, so it cannot directly return the error from the enclosing function.
.parse(), as it's used below, will return Result<T, ParseIntError>.
use std::error::Error;

const DATA: &str = "1 2 3 4\n5 6 7 8";

fn main() -> Result<(), Box<dyn Error>> {
    let data = DATA.lines()
        .map(|l| l.split_whitespace()
            .map(|n| n.parse() /* can fail */)
            .collect())
        .collect::<Result<Vec<Vec<i32>>, _>>()?;
    println!("{:?}", data);
    Ok(())
}
Note that the outer .collect::<..>() generic expression specifies Result<Vec<Vec<..>>, _>. The inner .collect() produces a Result for each row; the outer collect takes the Ok contents of those Results and assembles the 2-D vector, or stops at the first Err.
Without relying heavily on type inference, the inner .collect() generic expression would look like this:
.collect::<Result<Vec<i32>, _>>()) // <--- Inner.
.collect::<Result<Vec<Vec<i32>>, _>>()?; // <--- Outer.
Using the ? syntax, the variable data will be assigned this 2-D vector, or the main() function will return a parsing error that originated within the inner closure.
output:
[[1, 2, 3, 4], [5, 6, 7, 8]]
Taking it a step further, parse results nested three levels deep can be handled this way.
use std::error::Error;

type Vec3D<T, E> = Result<Vec<Vec<Vec<T>>>, E>;

const DATA: &str = "1 2 | 3 4\n5 6 | 7 8";

fn main() -> Result<(), Box<dyn Error>> {
    let data = DATA.lines()
        .map(|a| a.split("|")
            .map(|b| b.split_whitespace()
                .map(|c| c.parse()) // <---
                .collect())
            .collect())
        .collect::<Vec3D<i32, _>>()?;
    println!("{:?}", data);
    Ok(())
}
output:
[[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
Or if a number couldn't be parsed, we'd get:
Error: ParseIntError { kind: InvalidDigit }
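As a possible alternative to relying on nested collect() inference, a closure can be given an explicit Result return type, which makes ? usable inside it (it then returns from the closure, and the outer collect propagates the error). A minimal sketch, where parse_grid is a hypothetical helper name:

```rust
use std::num::ParseIntError;

// Parses whitespace-separated integers, one row per line.
fn parse_grid(data: &str) -> Result<Vec<Vec<i32>>, ParseIntError> {
    data.lines()
        // The explicit return type tells `?` what to convert the error into.
        .map(|line| -> Result<Vec<i32>, ParseIntError> {
            let mut row = Vec::new();
            for token in line.split_whitespace() {
                // On failure this returns Err from the closure only.
                row.push(token.parse::<i32>()?);
            }
            Ok(row)
        })
        // Collecting into Result stops at the first row that failed.
        .collect()
}
```

This trades some brevity for closures whose types are spelled out, which may be easier to read once the nesting gets deep.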

This answer pertains to a pre-1.0 version of Rust; the functions it relies on have since been removed.
You can use the std::result::fold function for this. It stops iterating after encountering the first Err.
An example program I just wrote:
fn main() {
println!("{}", go([1, 2, 3]));
println!("{}", go([1, -2, 3]));
}
fn go(v: &[int]) -> Result<Vec<int>, String> {
std::result::fold(
v.iter().map(|&n| is_positive(n)),
vec![],
|mut v, e| {
v.push(e);
v
})
}
fn is_positive(n: int) -> Result<int, String> {
if n > 0 {
Ok(n)
} else {
Err(format!("{} is not positive!", n))
}
}
Output:
Ok([1, 2, 3])
Err(-2 is not positive!)
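For readers on current Rust, where std::result::fold no longer exists, a rough equivalent of the program above might use Iterator::try_fold. This is a sketch rather than a port endorsed by the original answer; it swaps the removed int type for i32:

```rust
fn is_positive(n: i32) -> Result<i32, String> {
    if n > 0 {
        Ok(n)
    } else {
        Err(format!("{} is not positive!", n))
    }
}

// Accumulates the checked values, stopping at the first Err.
fn go(v: &[i32]) -> Result<Vec<i32>, String> {
    v.iter()
        .try_fold(Vec::new(), |mut acc, &n| -> Result<Vec<i32>, String> {
            acc.push(is_positive(n)?);
            Ok(acc)
        })
}
```

For this particular case, v.iter().map(|&n| is_positive(n)).collect() would give the same result; try_fold is shown because it mirrors the shape of the old fold call.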

Related

Break the function map() [duplicate]

How do I return from top function inside a closure? [duplicate]

How can I replace `.unwrap()` with `?` when mapping over an `ndarray::Array`?

I'd like to remove the use of .unwrap() from code which maps over an ndarray::Array and use a Result type for get_data() instead.
extern crate ndarray;

use ndarray::prelude::*;
use std::convert::TryFrom;
use std::error::Error;

fn get_data() -> Array2<usize> {
    // In actual code, "a" comes from an external source, and the type
    // is predetermined
    let a: Array2<i32> = arr2(&[[1, 2, 3], [4, 5, 6]]);
    let b: Array2<usize> = a.map(|x| usize::try_from(*x).unwrap());
    b
}

fn main() -> Result<(), Box<dyn Error>> {
    let a = get_data();
    println!("{:?}", a);
    Ok(())
}
For Vec, I've found this trick: How do I stop iteration and return an error when Iterator::map returns a Result::Err?.
However, this does not work with Arrays: collect isn't defined, and the semantics don't quite match up, since ndarray::Array defines a block of primitive types, which (AFAIU) can't hold Results.
Is there a nice way to handle this?

A native try_map implementation in ndarray would be ideal: it could short-circuit the computation and return as soon as an error occurs, and it would be more composable.
Short of that, there is nothing wrong with a good old mutable sentinel variable:
extern crate ndarray;

use ndarray::prelude::*;
use std::convert::TryFrom;
use std::error::Error;
use std::num::TryFromIntError;

fn get_data() -> Result<Array2<usize>, TryFromIntError> {
    let mut err = None;
    let a: Array2<i32> = arr2(&[[1, 2, 3], [4, 5, 6]]);
    let b: Array2<usize> = a.map(|&x| {
        usize::try_from(x).unwrap_or_else(|e| {
            err = Some(e);
            Default::default()
        })
    });
    err.map_or(Ok(b), Err)
}

fn main() -> Result<(), Box<dyn Error>> {
    let a = get_data()?;
    println!("{:?}", a);
    Ok(())
}
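The same sentinel idea can be tried without ndarray. The sketch below is only an analogy: it uses fixed-size standard-library arrays, whose map method is also infallible, and a hypothetical to_usize_grid function. Unlike the version above, it keeps the first error rather than the last:

```rust
use std::convert::TryFrom;
use std::num::TryFromIntError;

// Converts every element, remembering the first conversion error, if any.
fn to_usize_grid(a: [[i32; 3]; 2]) -> Result<[[usize; 3]; 2], TryFromIntError> {
    let mut err = None;
    let b = a.map(|row| {
        row.map(|x| {
            usize::try_from(x).unwrap_or_else(|e| {
                // Sentinel: record the first failure and substitute a placeholder.
                if err.is_none() {
                    err = Some(e);
                }
                0
            })
        })
    });
    // The placeholder-filled array is discarded if any element failed.
    match err {
        Some(e) => Err(e),
        None => Ok(b),
    }
}
```

Note that, as with the ndarray version, every element is still visited after a failure; the sentinel only changes what is returned, not how much work is done.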

Is there an idiomatic way to unwrap each returned Result from an iterator? [duplicate]

I have a function that returns a Result:
fn find(id: &Id) -> Result<Item, ItemError> {
// ...
}
Then another using it like this:
let parent_items: Vec<Item> = parent_ids.iter()
.map(|id| find(id).unwrap())
.collect();
How do I handle the case of failure inside any of the map iterations?
I know I could use flat_map and in this case the error results would be ignored:
let parent_items: Vec<Item> = parent_ids.iter()
.flat_map(|id| find(id).into_iter())
.collect();
Result's iterator has either 0 or 1 items depending on the success state, and flat_map will filter it out if it's 0.
However, I don't want to ignore errors, I want to instead make the whole code block just stop and return a new error (based on the error that came up within the map, or just forward the existing error).
How do I best handle this in Rust?

Result implements FromIterator, so you can move the Result outside and iterators will take care of the rest (including stopping iteration if an error is found).
#[derive(Debug)]
struct Item;
type Id = String;
fn find(id: &Id) -> Result<Item, String> {
Err(format!("Not found: {:?}", id))
}
fn main() {
let s = |s: &str| s.to_string();
let ids = vec![s("1"), s("2"), s("3")];
let items: Result<Vec<_>, _> = ids.iter().map(find).collect();
println!("Result: {:?}", items);
}
Playground

The accepted answer shows how to stop on error while collecting, and that's fine because that's what the OP requested. If you need processing that also works on large or infinite fallible iterators, read on.
As already noted, for can be used to emulate stop-on-error, but that is sometimes inelegant, as when you want to call max() or other method that consumes the iterator. In other situations it's next to impossible, as when the iterator is consumed by code in another crate, such as itertools or Rayon1.
Iterator consumer: try_for_each
When you control how the iterator is consumed, you can just use try_for_each to stop on first error. It accepts a closure that returns a Result, and try_for_each() will return Ok(()) if the closure returned Ok every time, and the first Err on the first error. This allows the closure to detect errors simply by using the ? operator in the natural way:
use std::{fs, io};
fn main() -> io::Result<()> {
fs::read_dir("/")?.try_for_each(|e| -> io::Result<()> {
println!("{}", e?.path().display());
Ok(())
})?;
// ...
Ok(())
}
If you need to maintain state between the invocations of the closure, you can also use try_fold. Both methods are implemented by ParallelIterator, so the same pattern works with Rayon.
try_for_each() does require that you control how the iterator is consumed. If that is done by code not under your control - for example, if you are passing the iterator to itertools::merge() or similar, you will need an adapter.
Iterator adapter: scan
The first attempt at stopping on error is to use take_while:
use std::{io, fs};
fn main() -> io::Result<()> {
fs::read_dir("/")?
.take_while(Result::is_ok)
.map(Result::unwrap)
.for_each(|e| println!("{}", e.path().display()));
// ...
Ok(())
}
This works, but we don't get any indication that an error occurred, the iteration just silently stops. Also it requires the unsightly map(Result::unwrap) which makes it seem like the program will panic on error, which is in fact not the case as we stop on error.
Both issues can be addressed by switching from take_while to scan, a more powerful combinator that not only supports stopping the iteration, but passes its callback owned items, allowing the closure to extract the error to the caller:
fn main() -> io::Result<()> {
let mut err = Ok(());
fs::read_dir("/")?
.scan(&mut err, |err, res| match res {
Ok(o) => Some(o),
Err(e) => {
**err = Err(e);
None
}
})
.for_each(|e| println!("{}", e.path().display()));
err?;
// ...
Ok(())
}
If needed in multiple places, the closure can be abstracted into a utility function:
fn until_err<T, E>(err: &mut &mut Result<(), E>, item: Result<T, E>) -> Option<T> {
match item {
Ok(item) => Some(item),
Err(e) => {
**err = Err(e);
None
}
}
}
...in which case we can invoke it as .scan(&mut err, until_err) (playground).
These examples trivially exhaust the iterator with for_each(), but one can chain it with arbitrary manipulations, including Rayon's par_bridge(). Using scan() it is even possible to collect() the items into a container and have access to the items seen before the error, which is sometimes useful and unavailable when collecting into Result<Container, Error>.
1 Needing to use par_bridge() comes up when using Rayon to process streaming data in parallel:
fn process(input: impl BufRead + Send) -> std::Result<Output, Error> {
let mut err = Ok(());
let output = lines
.input()
.scan(&mut err, until_err)
.par_bridge()
.map(|line| ... executed in parallel ... )
.reduce(|item| ... also executed in parallel ...);
err?;
...
Ok(output)
}
Again, equivalent effect cannot be trivially achieved by collecting into Result.

Handling nested .map() closure Result's
What if we have a .map() within a .map() within a .map()?
Here's an example for the specific case where the .map() operations are nested. The problem it solves is how to propagate a failure from the innermost closure while avoiding using .unwrap() which aborts the application.
This approach also enables using ? syntax at the outer layer to capture the error if one occurs, or unwrap the result to assign to a variable if no error occurred. ? can't otherwise be used from inside the closures.
.parse() as it's used below will return Result<T, ParseIntError>.
use std::error::Error;
const DATA: &str = "1 2 3 4\n5 6 7 8";
fn main() -> Result<(), Box<dyn Error>>
{
let data = DATA.lines().map(|l| l.split_whitespace()
.map(|n| n.parse() /* can fail */)
.collect())
.collect::<Result<Vec<Vec<i32>>, _>>()?;
println!("{:?}", data);
Ok(())
}
Note that the outer .collect::<..>() generic expression specifies Result<Vec<Vec<..>>. The inner .collect() will be producing Results, which are stripped away by the outer Result as it takes the Ok contents and produces the 2-D vector.
Without relying heavily on type inference, the inner .collect() generic expression would look like this:
.collect::<Result<Vec<i32>, _>>()) // <--- Inner.
.collect::<Result<Vec<Vec<i32>>, _>>()?; // <--- Outer.
Using the ? syntax, the variable, data, will be assigned this 2-D vector; or the main() function will return a parsing error that originated from within the inner closure.
output:
[[1, 2, 3, 4], [5, 6, 7, 8]]
Taking it a step further, parse results nested three levels deep can be handled this way.
type Vec3D<T, E> = Result<Vec<Vec<Vec<T>>>, E>;
const DATA: &str = "1 2 | 3 4\n5 6 | 7 8";
fn main() -> Result<(), Box<dyn Error>>
{
let data = DATA.lines()
.map(|a| a.split("|")
.map(|b| b.split_whitespace()
.map(|c| c.parse()) // <---
.collect())
.collect())
.collect::<Vec3D<i32,_>>()?;
println!("{:?}", data);
Ok(())
}
output:
[[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
Or if a number couldn't be parsed, we'd get:
Error: ParseIntError { kind: InvalidDigit }

Note: this answer pertains to a pre-1.0 version of Rust; the functions it relies on have since been removed.
You can use the std::result::fold function for this. It stops iterating after encountering the first Err.
An example program I just wrote:
fn main() {
    println!("{}", go([1, 2, 3]));
    println!("{}", go([1, -2, 3]));
}
fn go(v: &[int]) -> Result<Vec<int>, String> {
    std::result::fold(
        v.iter().map(|&n| is_positive(n)),
        vec![],
        |mut v, e| {
            v.push(e);
            v
        })
}
fn is_positive(n: int) -> Result<int, String> {
    if n > 0 {
        Ok(n)
    } else {
        Err(format!("{} is not positive!", n))
    }
}
Output:
Ok([1, 2, 3])
Err(-2 is not positive!)
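Since std::result::fold no longer exists, a rough modern counterpart of the program above can be sketched with Iterator::try_fold, which also stops at the first Err. This is an adaptation, not the original author's code, and it assumes i32 as a stand-in for the old int type.

```rust
/// Collect the values if all of them are positive, stopping at the first failure.
fn go(v: &[i32]) -> Result<Vec<i32>, String> {
    v.iter()
        .try_fold(Vec::new(), |mut acc, &n| -> Result<Vec<i32>, String> {
            acc.push(is_positive(n)?); // `?` ends the fold early on Err
            Ok(acc)
        })
}

fn is_positive(n: i32) -> Result<i32, String> {
    if n > 0 {
        Ok(n)
    } else {
        Err(format!("{} is not positive!", n))
    }
}

fn main() {
    println!("{:?}", go(&[1, 2, 3]));
    println!("{:?}", go(&[1, -2, 3]));
}
```

For this particular case, v.iter().map(|&n| is_positive(n)).collect() gives the same result, as in the accepted answer; try_fold is simply the closer analogue of the removed function.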
