Nice way to map with potential failure [duplicate] - rust

This question already has answers here:
How do I stop iteration and return an error when Iterator::map returns a Result::Err?
(4 answers)
Closed 6 years ago.
I'm trying to parse a series of Json objects with potential failures that cancel the whole function.
Ideally, I'd do something like:
fn .... -> Result<Vec<Video>, YoutubeParseError> {
    ...
    let videos = try!(doc.find("items").
        and_then(Json::as_array).
        ok_or(YoutubeParseError));
    Ok(videos.into_iter().
        map(|item| try!(json_to_video(item))).
        collect())
}
But of course try! doesn't escape the map() closure on error, so instead of Result<Vec<Video>,_> I get Vec<Result<Video,_>>. I could rewrite this as manual iteration that adds elements to a new vec, but I feel like I'm missing some simpler way of handling this.
Is there some existing function that would get me from Iter<Result<T>> to Result<Vec<T>,_> easily?

In functional programming languages you can treat options and results as containers and Rust is similar, so you can map / flat_map over them. You could do this with flat_map. If videos is already a vector, you can just test for expected number of Ok's against a flat_mapped length to decide whether to return Ok.
However, you should try to keep things lazy and not continue parsing after the first failure. take_while would be an option here. Either way, you will need to track whether you saw a parse failure along the way. Something like the code below works. It demonstrates how flat_map drops Err values, but it parses more than necessary. You could also use a .filter and then a .map to get the parse result.
fn get_videos(test: &Vec<&str>) -> Result<Vec<u32>, &'static str> {
    let videos = ...
    let expected = videos.len();
    // flat_map silently drops every Err, so compare the lengths afterwards
    let extracted: Vec<u32> = videos.into_iter().flat_map(|x| json_to_video(x)).collect();
    if extracted.len() == expected {
        Ok(extracted)
    } else {
        Err("not_ok")
    }
}
Here's an option to do it lazily:
// iter() only borrows, so videos.len() is still available afterwards
let extracted: Vec<_> = videos.iter().map(|x| json_to_video(x))
    .take_while(|x| x.is_ok())
    .map(|x| x.ok().unwrap())
    .collect();
You can call unwrap because everything from the first failure onward has been dropped. Now you return Ok if extracted.len() == videos.len().
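For comparison, the standard library implements FromIterator for Result, so collect() can build a Result<Vec<T>, E> directly and stops at the first Err. Below is a minimal sketch of that; parse_video is a hypothetical stand-in for the question's json_to_video, and the u32/String types are assumptions chosen only to keep the example self-contained.

```rust
// Hypothetical stand-in for json_to_video: parsing a number can fail.
fn parse_video(raw: &str) -> Result<u32, String> {
    raw.parse::<u32>()
        .map_err(|e| format!("bad item {:?}: {}", raw, e))
}

// Collecting an iterator of Result<T, E> into Result<Vec<T>, E>
// short-circuits: the first Err is returned and later items are not parsed.
fn get_videos(items: &[&str]) -> Result<Vec<u32>, String> {
    items.iter().map(|item| parse_video(item)).collect()
}
```

With this shape there is no need to compare lengths or to track a failure flag by hand; the error from the failing item is what the caller receives.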

Related

Re-use already advanced iterator for different function

While iterating over lines in a file I need to first do "task_A" and then "task_B". The first few lines contain data that I need to put into some data structure (task_A), and the lines after that describe how the data inside the data structure is manipulated (task_B). Right now I use a for-loop with enumerate and an if-else statement that switches depending on the line number:
let file = File::open("./example.txt").unwrap();
let reader = BufReader::new(file);
for (i, lines) in reader.lines().map(|l| l.unwrap()).enumerate() {
    if i < n {
        do_task_a(&lines);
    } else {
        do_task_b(&lines);
    }
}
There is also the take_while()-method for iterators. But this only solves one part. Ideally I would pass the iterator for n steps to one function and after that to another function. I want to have a solution that only needs to iterate over the file one time.
(For anyone wondering: I want a more elegant solution for the 5th day of Advent of Code 2022.) Is there a way to do that? To "re-use" the iterator when it is already advanced n steps?
Looping or using an iterator adapter will consume an iterator. But if I is an iterator then so is &mut I!
You can use that instance to partially iterate through the iterator with one adapter and then continue with another. The first use consumes only the mutable reference, but not the iterator itself. For example using take:
let mut it = reader.lines().map(|l| l.unwrap());
for lines in (&mut it).take(n) {
    do_task_a(&lines);
}
for lines in it {
    do_task_b(&lines);
}
But I think your original code is still completely fine.
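A self-contained sketch of the same idea, using Iterator::by_ref (which hands out the &mut I mentioned above). The function name split_work and the two output vectors are assumptions made for illustration; they stand in for do_task_a and do_task_b.

```rust
// Consume the first n items through a mutable borrow, then keep going
// with the same iterator for everything that is left.
fn split_work<I>(mut lines: I, n: usize) -> (Vec<String>, Vec<String>)
where
    I: Iterator<Item = String>,
{
    let mut setup = Vec::new(); // stands in for task A
    let mut moves = Vec::new(); // stands in for task B

    // by_ref() borrows the iterator, so take(n) does not consume it.
    for line in lines.by_ref().take(n) {
        setup.push(line);
    }
    // The iterator resumes exactly after the n-th item.
    for line in lines {
        moves.push(line);
    }
    (setup, moves)
}
```

One caveat worth knowing: take(n) stops without pulling an extra item, but take_while on a borrowed iterator also consumes the first item that fails the predicate, so that item would be lost to the second loop.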

How is Rust persuaded to do an explicit conversion rather than try to save references? [duplicate]

This question already has answers here:
In Rust, what's the difference between "shadowing" and "mutability"?
(1 answer)
Proper way to return a new string in Rust
(2 answers)
Return local String as a slice (&str)
(7 answers)
Why can't I return an &str value generated from a String?
(1 answer)
How do I make format! return a &str from a conditional expression?
(3 answers)
Closed 2 years ago.
How is the following made to work (safely)? There are serious downsides to using 'static.
fn whatever() {
    let mut myslice = "goodbye"; //variable that lives, but not necessarily the whole program!
    print!("{}", myslice);
    {
        let mystring = "hello".to_string();
        myslice = &*mystring;
    }
    print!("{}", myslice);
}
The second print should produce 'hello'.
I encounter this problem commonly in lots of forms.
The open bracket could represent multiple things, like calling a function or using an if statement.
E.g. 'If there are problems with the value in myslice and things are not working properly {'.
Working out the replacement (which in the above example proved to be 'hello') is frequently no easy or quick matter, and involves code that should not be touched unless a problem has been proved. As is normal in Rust, there are many alternatives to &*mystring (&mystring[..], : &str on the left, mystring.as_str(), etc.), but none explicitly manipulates the perfectly available, mutable and sufficiently long-lived variable as if I had typed let found = "hello".to_string(); myslice = &found; outside the curly brackets. I have tried .clone() in various places. Why does Rust make such a meal of this simple request? Obviously I am prepared to pay the minuscule processor time to actually do the request.
I would like a general solution. However, it seems the above problem is specific to the type String. E.g. let found = "hello"; let myslice = found; seems to work, even inside the brackets. (found is now a &str and does not seem to ever be 'borrowed'.) Is the problem directly or indirectly tied up with not knowing the length at compile time? Unfortunately, and frequently, this is not in my control; I have to use what crates decide to give me.
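One possible approach, sketched here under assumptions, is to declare the owning String in the outer scope and only initialize it inside the block, so the borrow stays valid after the closing bracket. The variable name owner and the String return value (used instead of print! so the result can be checked) are illustrative and not from the question.

```rust
fn whatever() -> String {
    let mut out = String::new();

    // Declared in the outer scope but not yet initialized, so whatever
    // is stored here later outlives the inner block.
    let owner: String;
    let mut myslice: &str = "goodbye";
    out.push_str(myslice);
    {
        // The String is created inside the block but owned outside it.
        owner = "hello".to_string();
        myslice = &owner;
    }
    out.push_str(myslice); // still valid: owner lives until the function ends
    out
}
```

When the value really has to outlive the function, the usual alternative is to hold a String (an owned value) rather than a &str.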

Getting query string from Window object in WebAssembly in Rust

Context: I am learning Rust & WebAssembly and as a practice exercise I have a project that paints stuff in HTML Canvas from Rust code. I want to get the query string from the web request and from there the code can decide which drawing function to call.
I wrote this function to just return the query string with the leading ? removed:
fn decode_request(window: web_sys::Window) -> std::string::String {
    let document = window.document().expect("no global window exist");
    let location = document.location().expect("no location exists");
    let raw_search = location.search().expect("no search exists");
    let search_str = raw_search.trim_start_matches("?");
    format!("{}", search_str)
}
It does work, but it seems amazingly verbose given how much simpler it would be in some of the other languages I have used.
Is there an easier way to do this? Or is the verbosity just the price you pay for safety in Rust and I should just get used to it?
Edit per answer from @IInspectable:
I tried the chaining approach and I get an error of:
temporary value dropped while borrowed
creates a temporary which is freed while still in use
note: consider using a `let` binding to create a longer lived value rustc(E0716)
It would be nice to understand that better; I am still getting the niceties of ownership through my head. The function is now:
fn decode_request(window: Window) -> std::string::String {
    let location = window.location();
    let search_str = location.search().expect("no search exists");
    let search_str = search_str.trim_start_matches('?');
    search_str.to_owned()
}
which is certainly an improvement.
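On the E0716 message quoted above: it typically appears when a borrowed slice is kept past the end of the statement that created the temporary String it points into. A minimal sketch without web_sys follows; fake_search is a hypothetical stand-in for location.search(), and its return type is an assumption.

```rust
// Hypothetical stand-in for location.search(): returns an owned String.
fn fake_search() -> Result<String, String> {
    Ok("?shape=circle".to_string())
}

fn decode_request_sketch() -> String {
    // This form fails with E0716 once `trimmed` is used on a later line,
    // because the String returned by expect() is a temporary that is
    // dropped at the end of the statement:
    //
    //     let trimmed = fake_search().expect("no search").trim_start_matches('?');
    //
    // Binding the String first gives the borrow something to point into.
    let raw = fake_search().expect("no search exists");
    let trimmed = raw.trim_start_matches('?');
    trimmed.to_owned()
}
```

A fully chained expression that ends in to_owned() avoids the problem as well, since the owned copy is made before the temporary goes away.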
This question is really about API design rather than its effects on the implementation. The implementation turned out to be fairly verbose mostly due to the contract chosen: Either produce a value, or die. There's nothing inherently wrong with this contract. A client calling into this function will never observe invalid data, so this is perfectly safe.
This may not be the best option for library code, though. Library code usually lacks context, and cannot make a good call on whether any given error condition is fatal or not. That's a question client code is in a far better position to answer.
Before moving on to explore alternatives, let's rewrite the original code in a more compact fashion, by chaining the calls together, without explicitly assigning each result to a variable:
fn decode_request(window: web_sys::Window) -> std::string::String {
    window
        .location()
        .search().expect("no search exists")
        .trim_start_matches('?')
        .to_owned()
}
I'm not familiar with the web_sys crate, so there is a bit of guesswork involved, namely the assumption that window.location() returns the same value as the document()'s location(). Apart from chaining calls, the code presented employs two more changes:
trim_start_matches() is passed a character literal in place of a string literal. This produces optimal code without relying on the compiler's optimizer to figure out, that a string of length 1 is attempting to search for a single character.
The return value is constructed by calling to_owned(). The format! macro adds overhead, and eventually calls to_string(). While that would exhibit the same behavior in this case, using the semantically more accurate to_owned() function helps you catch errors at compile time (e.g. if you accidentally returned 42.to_string()).
Alternatives
A more natural way to implement this function is to have it return either a value representing the query string, or no value at all. Rust provides the Option type to conveniently model this:
fn decode_request(window: web_sys::Window) -> Option<String> {
    match window
        .location()
        .search() {
        Ok(s) => Some(s.trim_start_matches('?').to_owned()),
        _ => None,
    }
}
This allows a client of the function to make decisions, depending on whether the function returns Some(s) or None. This maps all error conditions into a None value.
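The same mapping of every error to None can also be written with Result::ok followed by Option::map. The sketch below leaves web_sys out so it stays self-contained; query_from_search is a hypothetical helper that takes the Result which search() would have produced.

```rust
// Takes the Result that `location.search()` would return and turns it
// into an Option, discarding whatever error value was present.
fn query_from_search<E>(search: Result<String, E>) -> Option<String> {
    search
        .ok() // Result<String, E> -> Option<String>
        .map(|s| s.trim_start_matches('?').to_owned())
}
```

Whether match or the combinator form reads better is largely a matter of taste; both behave the same here.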
If it is desirable to convey the reason for failure back to the caller, the decode_request function can choose to return a Result value instead, e.g. Result<String, wasm_bindgen::JsValue>. In doing so, an implementation can take advantage of the ? operator, to propagate errors to the caller in a compact way:
fn decode_request(window: web_sys::Window) -> Result<String, wasm_bindgen::JsValue> {
    Ok(window
        .location()
        .search()?
        .trim_start_matches('?')
        .to_owned())
}

Is there a way to continue within a futures for_each stream?

I am doing a for_each loop over a stream of futures received via an mpsc::Receiver:
rx.for_each(move |trade| {
    if something_true {
        continue;
    }
    // down here I have computation logic which returns a future
});
I would like to do something like the logic above.
Of course, I could just do an if/else statement but both branches have to return the same type of future, which is hard for me to do as the future I generate in my computation logic is a long chain of messy futures. Which got me thinking if there is actually a simple way of approaching this, like a continue or some sort?
Let's solve the two issues separately. First, the easiest: if your chain of futures inside for_each() is not homogeneous (they rarely will be), consider returning a boxed future (i.e. Box<dyn Future<Item = _, Error = _>>). You may need to typecast the closure return to that, as the compiler will sometimes not get what you are trying to do.
Now, for the "continue if condition" - this typically means you're filtering out certain elements of the stream, which indicates that the better function to call may include filter() or an intermediate state - i.e. returning a future whose item type is Option<_>, and then filtering based on that in the next member of the chain.
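To illustrate the boxed-future idea without depending on the futures crate, here is a sketch that uses std::future::Future instead. Note that this is a swap: the question uses the older futures 0.1 API (Item/Error), while std futures have a single Output type. The names handle, run_ready and NoopWake are hypothetical, and run_ready is only a tiny polling loop suitable for futures that never actually wait.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// One boxed type that both branches can return.
type BoxedFuture<T> = Pin<Box<dyn Future<Output = T>>>;

// Hypothetical per-item handler: the early return plays the role of
// `continue`, the second branch stands in for the long chain of futures.
fn handle(trade: u32) -> BoxedFuture<u32> {
    if trade % 2 == 0 {
        return Box::pin(std::future::ready(0));
    }
    Box::pin(async move { trade * 10 })
}

// Waker that does nothing; enough for futures that never pend.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Minimal driver for illustration only (assumes the future is always
// ready after a finite number of polls). A real program would use an
// executor instead.
fn run_ready<T>(mut fut: BoxedFuture<T>) -> T {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

Both branches have different concrete future types, but the Box erases that difference, which is the point of the suggestion above.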
No, you cannot. continue is syntax that is only accepted by the core Rust language and crates cannot make use of it.
You could instead return early:
rx.for_each(move |trade| {
    if true {
        return future::ok(());
    }
    future::ok(())
});
both branches have to return the same type of future
Use Either or a boxed trait object:
rx.for_each(move |trade| {
    if true {
        return Either::A(future::ok(()));
    }
    Either::B(future::lazy(|| future::ok(())))
});
See also:
How do I conditionally return different types of futures?
I'd probably move the condition to the stream such that the for_each never sees it:
rx.filter(|trade| true)
    .for_each(move |trade| future::ok(()));

Are there any nice cases where we should use `unwrap`? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
Since using unwrap may be problematic because it crashes in the error scenario, it may be considered as dangerous usage.
What if I am hundred percent sure that it will not crash, like in the following scenarios:
if option.is_some() {
    let value = option.unwrap();
}
if result.is_ok() {
    let result_value = result.unwrap();
}
Since we already checked the Result and Option there will be no crash with the unwrap() usage. However, we could have used match or if let. In my opinion, either match or if let usage is more elegant.
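For reference, the if let and match forms mentioned above could look like this. The function names and the u32/String types are placeholders chosen for the sketch.

```rust
// `if let` checks and extracts in one step, so no unwrap is needed.
fn describe(option: Option<u32>) -> String {
    if let Some(value) = option {
        format!("got {}", value)
    } else {
        "nothing".to_string()
    }
}

// `match` does the same for Result and forces both cases to be handled.
fn double_or_zero(result: Result<u32, String>) -> u32 {
    match result {
        Ok(value) => value * 2,
        Err(_) => 0,
    }
}
```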
Let's focus on Result; I'll go back to Option at the end.
The purpose of Result is to signal a result which may succeed or fail with an error. As such, any use of it should fall into this category. Let's ignore the cases where a crate returns Result for impossible-to-fail operations.
By doing what you are doing (checking if result.is_ok() then extracting the value), you're effectively doing the same thing twice. The first time, you're inspecting the content of the Result, and the second, you're checking and extracting unsafely.
This could indeed have been done with a match or map, and both would have been more idiomatic than an if. Consider this case for a moment:
You have an object implementing the following trait:
use std::io::{Error, ErrorKind};
trait Worker {
    fn hours_left(&self) -> Result<u8, Error>;
    fn allocate_hours(&mut self, hours: u8) -> Result<u8, Error>;
}
We're going to assume hours_left() does exactly what it says on the tin. We'll also assume we have a mutable borrow of Worker. Let's implement allocate_hours().
In order to do so, we'll obviously need to check if our worker has extra hours left over to allocate. You could write it similar to yours:
fn allocate_hours(&mut self, hours: u8) -> Result<u8, Error> {
    let hours_left = self.hours_left();
    if (hours_left.is_ok()) {
        let remaining_hours = hours_left.unwrap();
        if (remaining_hours < hours) {
            return Err(Error::new(ErrorKind::NotFound, "Not enough hours left"));
        }
        // Do the actual operation and return
    } else {
        return hours_left;
    }
}
However, this implementation is both clunky and inefficient. We can simplify this by avoiding unwrap and if statements altogether.
fn allocate_hours(&mut self, hours: u8) -> Result<u8, Error> {
    self.hours_left()
        .and_then(|hours_left| {
            // We are certain that our worker is actually there to receive hours
            // but we are not sure if he has enough hours. Check.
            match hours_left {
                x if x >= hours => Ok(x),
                _ => Err(Error::new(ErrorKind::NotFound, "Not enough hours")),
            }
        })
        .map(|hours_left| {
            // At this point we are sure the worker has enough hours.
            // Do the operations
        })
}
We've killed multiple birds with one stone here. We've made our code more readable, easier to follow and we've removed a whole bunch of repeated operations. This is also beginning to look like Rust and less like PHP ;-)
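A runnable variant of the same idea is sketched below. The Employee struct, its hours field, and the choice to return the hours remaining after the allocation are assumptions made for illustration; the answer above leaves the actual operation open.

```rust
use std::io::{Error, ErrorKind};

trait Worker {
    fn hours_left(&self) -> Result<u8, Error>;
    fn allocate_hours(&mut self, hours: u8) -> Result<u8, Error>;
}

// Hypothetical implementor, only here to make the example complete.
struct Employee {
    hours: u8,
}

impl Worker for Employee {
    fn hours_left(&self) -> Result<u8, Error> {
        Ok(self.hours)
    }

    fn allocate_hours(&mut self, hours: u8) -> Result<u8, Error> {
        // and_then runs only on Ok; `?` passes any Err up to the caller.
        let remaining = self.hours_left().and_then(|left| {
            if left >= hours {
                Ok(left - hours)
            } else {
                Err(Error::new(ErrorKind::NotFound, "Not enough hours"))
            }
        })?;
        // Assumed operation: record the allocation, report what is left.
        self.hours = remaining;
        Ok(remaining)
    }
}
```

No unwrap or is_ok check appears anywhere; a failed allocation leaves the worker's hours unchanged.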
Option is similar and supports the same operations. If you want to process the content of either Option or Result and branch accordingly, and you're using unwrap, there are so many pitfalls you'll inevitably fall into when you forget you unwrapped something.
There are genuine cases where your program should barf out. For those, consider expect(&str) as opposed to unwrap()
In many, many cases you can avoid unwrap and others by more elegant means. However, I think there are situations where it is the correct solution to unwrap.
For example, many methods in Iterator return an Option. Let us assume that you have a nonempty slice (known to be nonempty by invariants) and you want to obtain the maximum, you could do the following:
assert!(!slice.is_empty()); // known to be nonempty by invariants
do_stuff_with_maximum(slice.iter().max().unwrap());
There are probably several opinions regarding this, but I would argue that using unwrap in the above scenario is perfectly fine in the presence of the preceding assert!.
My guideline is: If the parameters I am dealing with are all coming from my own code, not interfacing with 3rd party code, possibly assert!ing invariants, I am fine with unwrap. As soon as I am the slightest bit unsure, I resort to if, match, map and others.
Note that there is also expect which is basically an "unwrap with a comment printed in the error case". However, I have found this to be not-really-ergonomic. Moreover, I found the backtraces a bit hard to read if unwrap fails. Thus, I currently use a macro verify! whose sole argument is an Option or Result and that checks that the value is unwrapable. It is implemented like this:
pub trait TVerifiableByVerifyMacro {
    fn is_verify_true(&self) -> bool;
}
impl<T> TVerifiableByVerifyMacro for Option<T> {
    fn is_verify_true(&self) -> bool {
        self.is_some()
    }
}
impl<TOk, TErr> TVerifiableByVerifyMacro for Result<TOk, TErr> {
    fn is_verify_true(&self) -> bool {
        self.is_ok()
    }
}
macro_rules! verify {($e: expr) => {{
    let e = $e;
    assert!(e.is_verify_true(), "verify!({}): {:?}", stringify!($e), e);
    e
}}}
Using this macro, the aforementioned example could be written as:
assert!(!slice.is_empty()); // known to be nonempty by invariants
do_stuff_with_maximum(verify!(slice.iter().max()).unwrap());
If I can't unwrap the value, I get an error message mentioning slice.iter().max(), so that I can search my codebase quickly for the place where the error occurs. (Which is - in my experience - faster than looking through the backtrace for the origin of the error.)

Resources