How do I continue even when a function errors? - rust

I'm using reqwest and scraper to write a simple web scraper in Rust. One section of my code is given a link "next_page" that may or may not be valid. I want to try to access the link and, if there's an error, print the error and continue to the next link. Here's what I tried to do:
let next_page_response = reqwest::get(next_page).await;
let next_page_response = match next_page_response {
Ok(response) => response,
Err(error) => println!("WARN: Problem getting the url '{}'. \
The error was {:?}", next_page, error),
};
This code is wrapped in a loop, and next_page changes every iteration.
This doesn't work: rustc gives error[E0308]: 'match' arms have incompatible types. I suppose this makes sense: in the first arm the expression evaluates to a Response, whereas in the second arm it evaluates to (). However, if I change println! to panic!, the code compiles.
Questions:
How can I acknowledge an error and then just continue?
Why does panic! work when println! doesn't?
Full code, for the curious.

As you allude to in the original post, the issue is that the two match arms evaluate to different types. And as kmdreko mentions, panic! works because it has the never type (!), which coerces to any other type.
So if you want to avoid the panic, you need to make both arms evaluate to the same type.
One way is to have both arms return (). The code can be pretty simple if you put the processing of the successful response into the body of the match.
pub fn process_response(response: &Response) {
// ...
}
for next_page in pages {
let next_page_response = reqwest::get(next_page).await;
match next_page_response {
Ok(response) => process_response(&response),
Err(error) => println!("WARN: Problem getting the url '{}'. \
The error was {:?}", next_page, error),
};
}
An alternative is to have both arms return an Option that you use later.
(In this case, it makes the code longer and uglier in my opinion, but there can be cases where it is useful). It could look something like this:
pub fn process_response(response: &Response) {
// ...
}
for next_page in pages {
let next_page_response = reqwest::get(next_page).await;
let next_page_response = match next_page_response {
Ok(response) => Some(response),
Err(error) => {
println!("WARN: Problem getting the url '{}'. \
The error was {:?}", next_page, error);
None
}
};
if let Some(response) = next_page_response {
process_response(&response)
}
}
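A third option follows from the panic! observation: continue also has the never type, so it can sit in the Err arm while the Ok arm yields the value. The sketch below uses only the standard library; fetch is a hypothetical stand-in for reqwest::get, and fetch_all is an illustrative name, not part of the original code.

```rust
/// Hypothetical stand-in for `reqwest::get`: succeeds only for http(s) URLs.
fn fetch(url: &str) -> Result<String, String> {
    if url.starts_with("http://") || url.starts_with("https://") {
        Ok(format!("body of {}", url))
    } else {
        Err(format!("unsupported URL scheme in '{}'", url))
    }
}

/// Fetches every page, logging and skipping the ones that fail.
fn fetch_all(pages: &[&str]) -> Vec<String> {
    let mut bodies = Vec::new();
    for &next_page in pages {
        let body = match fetch(next_page) {
            Ok(body) => body,
            Err(error) => {
                eprintln!(
                    "WARN: Problem getting the url '{}'. The error was {:?}",
                    next_page, error
                );
                // `continue` has type `!`, so this arm is compatible with `String`.
                continue;
            }
        };
        bodies.push(body);
    }
    bodies
}
```

Calling fetch_all with a mix of valid and invalid links returns only the bodies of the valid ones and prints a warning for each of the others.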

Related

How to properly handle a tokio::try_join! if one of the tasks panics and cleanly abort?

For the two async functions that I am passing to my try_join!(), let's say there's 3 ways that they could panic.
I'm trying to use a set_hook to catch the errors but I'm not sure how to do a match statement on the panics so I can display a custom error message for each of the ways that they can panic. It looks like set_hook takes a Box(Any) (?), so I was wondering if there was a way to check the type of Error. Basically I just don't want to do regex on the ErrString.
I'm also not sure what the best way to abort the runtime within each match branch. I'm currently using std::process::exit(0).
code looks like:
set_hook(Box::new(|panic_info| {
println!("Thread panicked! {}", panic_info);
// std::process::exit(0);
}));
let (result1, result2) = tokio::try_join!(func1, func2); // code that could panic
I want to be able to do something like
set_hook(Box::new(|panic_info| {
match panic_info {
panic_type_1 => { println!("X was invalid, please try using valid X") }
panic_type_2 => { println!("Y was invalid, please try using valid Y") }
panic_type_3 => { println!("Z was invalid, please try using valid Z") }
_ => { println!("Something else happened: {}", panic_info) }
}
}));
let (result1, result2) = tokio::try_join!(func1, func2); // code that could panic
Don't bother with set_hook. The future that tokio::task::spawn* returns resolves to a Result with a JoinError type, which has a [try_]into_panic to get the boxed object that was passed to panic.
The panic payload is stored as a Box<dyn Any + Send>, which you can downcast to a concrete type with downcast or downcast_ref (typically &str or String for panic! messages).
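As a rough illustration of the downcasting step, the sketch below uses std::thread instead of a tokio task, on the assumption that the payload you get back has the same Box<dyn Any + Send> shape. InputPanic, check_input, describe, and run_and_describe are hypothetical names introduced for this example.

```rust
use std::any::Any;
use std::panic::panic_any;
use std::thread;

/// Hypothetical payload type: one variant per way the task can fail.
#[derive(Debug)]
enum InputPanic {
    InvalidX,
    InvalidY,
}

/// Hypothetical task that panics with a typed payload instead of a string.
fn check_input(x: i32, y: i32) {
    if x < 0 {
        panic_any(InputPanic::InvalidX);
    }
    if y < 0 {
        panic_any(InputPanic::InvalidY);
    }
}

/// Turns a panic payload into a user-facing message by downcasting it.
fn describe(payload: Box<dyn Any + Send>) -> String {
    if let Some(kind) = payload.downcast_ref::<InputPanic>() {
        match kind {
            InputPanic::InvalidX => "X was invalid, please try using valid X".to_string(),
            InputPanic::InvalidY => "Y was invalid, please try using valid Y".to_string(),
        }
    } else if let Some(msg) = payload.downcast_ref::<&str>() {
        // `panic!("literal")` carries a `&'static str`.
        format!("Something else happened: {}", msg)
    } else if let Some(msg) = payload.downcast_ref::<String>() {
        // `panic!("{}", value)` carries a `String`.
        format!("Something else happened: {}", msg)
    } else {
        "Something else happened: unknown payload".to_string()
    }
}

/// Runs `task` on its own thread and returns a message if it panicked.
fn run_and_describe<F>(task: F) -> Option<String>
where
    F: FnOnce() + Send + 'static,
{
    thread::spawn(task).join().err().map(describe)
}
```

Matching on a typed payload like this avoids parsing the panic message text; the string branches only act as a fallback for ordinary panic! calls.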

How to log error and return/continue a result/option

I have come across many instances in that I need to write a code similar to this snippet. I was wondering if there's a shorter way to do it?
loop {
let res = match do() {
Ok(res) => res,
Err(e) => {
eprintln!("Error: {}", e);
continue;
}
};
// Do stuff with `res` ...
}
or
fn some_fn() {
let res = match do() {
Some(res) => res,
None => {
eprintln!("Error: not found");
return;
}
};
// Do stuff with `res` ...
}
I was looking for something like the ? keyword to return early with errors but in a case where the function returns nothing and I just want to return nothing if the result is None/Error.
Maybe something like this:
loop {
do().unwrap_or_log(|e| eprintln!("{}", e)).continue // :D
}
And consider do() is never gonna be this short. It's probably a chain of a few function calls which is already too long.
Maybe the way I'm doing it is the only way, or maybe I'm doing something wrong that forces me to write this, and I shouldn't be doing it at all?
This is pretty much how it is. However nice the many chaining functions are, they cannot affect the control flow where they are used.
One suggestion I may make: if you have many fallible operations that need to be logged and continued in an infallible context, you could move those into a single fallible function that you then can log and skip any errors all at once.
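A minimal sketch of that suggestion, using only the standard library: ratio is a hypothetical function that bundles several fallible steps behind ?, so the loop in process_lines (also a hypothetical name) needs a single match to log and skip.

```rust
use std::error::Error;

/// Hypothetical fallible pipeline: every step can fail, and `?` forwards the error.
fn ratio(line: &str) -> Result<i32, Box<dyn Error>> {
    let (num, den) = line.split_once('/').ok_or("missing '/'")?;
    let num: i32 = num.trim().parse()?;
    let den: i32 = den.trim().parse()?;
    // `checked_div` returns `None` on division by zero or overflow.
    let quotient = num.checked_div(den).ok_or("division failed")?;
    Ok(quotient)
}

/// Infallible caller: logs each failure once and moves on to the next line.
fn process_lines(lines: &[&str]) -> Vec<i32> {
    let mut results = Vec::new();
    for line in lines {
        match ratio(line) {
            Ok(value) => results.push(value),
            Err(e) => eprintln!("Error in '{}': {}", line, e),
        }
    }
    results
}
```

The per-step error handling lives inside ratio, so the calling loop stays the same length however many fallible calls the pipeline grows to.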
You aren't the only person to have complaints though and others have suggested changes to make this flow less bothersome.
There's let else, which was proposed for exactly this situation and has been stable since Rust 1.65. It looks like this (playground):
let Ok(res) = do_this() else { continue };
Or perhaps a postfix macro proposal could be implemented eventually, which may look like this:
let res = do_this().unwrap_or!{ continue };
I'll note, though, that neither of these gives access to the Err(e) value, so you'd still need a custom method for logging.
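One possible shape for such a custom method is an extension trait that logs the error and converts the Result into an Option, which then combines with let else. This is only a sketch: OkOrLog, ok_or_log, and sum_valid are hypothetical names, and it assumes a toolchain where let else is available (Rust 1.65 or later).

```rust
use std::fmt::Display;

/// Hypothetical extension trait: log the error, keep only the success value.
trait OkOrLog<T> {
    fn ok_or_log(self) -> Option<T>;
}

impl<T, E: Display> OkOrLog<T> for Result<T, E> {
    fn ok_or_log(self) -> Option<T> {
        match self {
            Ok(value) => Some(value),
            Err(e) => {
                eprintln!("Error: {}", e);
                None
            }
        }
    }
}

/// Sums the inputs that parse as integers, logging and skipping the rest.
fn sum_valid(inputs: &[&str]) -> i32 {
    let mut total = 0;
    for input in inputs {
        // The logging happens inside `ok_or_log`; `let else` handles control flow.
        let Some(n) = input.parse::<i32>().ok_or_log() else { continue };
        total += n;
    }
    total
}
```

The method itself still cannot continue or return on the caller's behalf; it only reduces the call site to one line.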

How can I accept invalid or self-signed SSL certificates in Rust futures reqwest?

My code looks like the following:
let fetches = futures::stream::iter(
hosts.into_iter().map(|url| {
async move {
match reqwest::get(&url).await {
// Ok and Err statements here!
}
But, the problem here is that it gives an error for URLs with invalid or self-signed SSL certificate. So, I tried to do the following:
let fetches = futures::stream::iter(
hosts.into_iter().map(|url| {
async move {
match reqwest::Client::builder().danger_accept_invalid_certs(true).build().unwrap().get(&url).await {
// Ok and Err statements here!
}
When I try to build it with Cargo, it says "error[E0277]: `RequestBuilder` is not a future".
So, how can I make my code accept invalid certificates?
Unlike the top-level get() function, which sends the request and returns a future that resolves to a Response, the Client::get() method that you call in the second snippet returns a RequestBuilder, which you must send() to actually perform the request.
Adding the missing send() allows the code to compile (playground):
fn main() {
let hosts: Vec<String> = vec![];
let fetches = futures::stream::iter(hosts.into_iter().map(|url| async move {
match reqwest::Client::builder()
.danger_accept_invalid_certs(true)
.build()
.unwrap()
.get(&url)
.send()
.await
{
Ok(x) => x,
Err(e) => panic!("request failed: {:?}", e),
}
}));
}

Getting current path when handling rejections

I'd like to know how it would be possible to get HTTP path in Warp's rejection handler? I've got the following rejection method:
pub(crate) async fn handle(err: Rejection) -> Result<impl Reply, Infallible> {
let response = if err.is_not_found() {
HttpApiProblem::with_title_and_type_from_status(StatusCode::NOT_FOUND)
} else if let Some(e) = err.find::<warp::filters::body::BodyDeserializeError>() {
HttpApiProblem::with_title_and_type_from_status(StatusCode::BAD_REQUEST)
.set_detail(format!("{}", e))
} else if let Some(e) = err.find::<Error>() {
handle_request_error(e)
} else if let Some(e) = err.find::<warp::reject::MethodNotAllowed>() {
HttpApiProblem::with_title_and_type_from_status(StatusCode::METHOD_NOT_ALLOWED)
.set_detail(format!("{}", e))
} else {
error!("handle_rejection catch all: {:?}", err);
HttpApiProblem::with_title_and_type_from_status(StatusCode::INTERNAL_SERVER_ERROR)
};
Ok(response.to_hyper_response())
}
For instance, I'd call curl localhost:1234/this-aint-valid-path/123 and would like to have access to /this-aint-valid-path/123 for logging purposes as well as returning this as part of the error response.
It looks like the method err.is_not_found() checks whether the reason matches Reason::NotFound, which is an enumeration variant with no parameters. Rejection structs have no additional metadata beyond their reason, so the code in your question cannot be modified to solve your problem. It is however possible to create a custom reason with whatever metadata you want. The method you're looking for to create that Rejection object is called custom, and the docs for it can be found here.

Error handling and conditional chaining of Actix actors

This is my first attempt at writing a small webservice with rust, using actix-web.
The code below is a request handler that is intended to do three things, insert an entry in the database, send an email if that db call was successful, and then return a json payload as the response.
data.dal (database call) and data.email_service are references to Actors.
The issue is that I am unable to capture the error returned by data.dal. Any attempt to reconfigure the code below seems to give me an error stating that the compiler wasn't able to find a conversion from Actix Mailbox to [Type].
Is there an alternate/better way to rewrite this? Basically, when the request is issued, I'd like to be able to call Actor A, and if the result from A is Ok, then call Actor B. If the results from both are okay, return a JSON payload. If either A or B returns an error (they can have different error types), return a custom error message.
pub fn register_email(
invitation: Json<EmailInvitationInput>,
data: web::Data<AppState>,
) -> impl Future<Item=HttpResponse, Error=Error> {
let m = dal::queries::CreateEmailInvitation { email: invitation.email.clone() };
data.dal.send(m)
.from_err()
.and_then(move |res| {
let invite = res.unwrap();
let email_input = email::SendLoginLink {
from: "from_email".to_string(),
to: "to_email".to_string(),
};
data.email_service.send(email_input)
.from_err()
.and_then(move |res| match res {
Ok(_) => {
Ok(HttpResponse::Ok().json(EmailInvitationOutput { expires_at: invite.expires_at }))
}
Err(err) => {
debug!("{:#?}", err);
Ok(ServiceError::InternalServerError.error_response())
}
})
})
}
What I usually do is have a single Error type that aggregates all the different errors. The conversion to this type can happen implicitly if you declare the appropriate From implementations and call from_err() as you are doing, but here I am being explicit:
I haven't tested this code snippet but this is how I have done it in projects I'm working on that use Actix:
data.dal.send(m)
.map_err(Error::Mailbox)
.and_then(|res| res.map_err(Error::Service))
.and_then(move |invite| {
let email_input = email::SendLoginLink {
from: "from_email".to_string(),
to: "to_email".to_string(),
};
data.email_service.send(email_input)
.map_err(Error::Mailbox)
.and_then(|res| res.map_err(Error::Service))
.and_then(move |res| HttpResponse::Ok().json(EmailInvitationOutput { expires_at: invite.expires_at }))
})
.or_else(|err| {
debug!("{:#?}", err);
ServiceError::InternalServerError.error_response()
})
(I'm assuming ServiceError implements IntoFuture just like HttpResponse does)
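For reference, the aggregating Error type with From implementations could look roughly like the following. This is a standard-library-only sketch, not Actix code: MailboxError and ServiceError here are hypothetical stand-ins defined locally, and send merely imitates the nested Result that an actor call produces (outer for delivery, inner for the handler).

```rust
/// Hypothetical stand-in for a mailbox/delivery failure.
#[derive(Debug, PartialEq)]
struct MailboxError;

/// Hypothetical stand-in for an application-level failure.
#[derive(Debug, PartialEq)]
struct ServiceError(String);

/// One error type that aggregates both failure kinds.
#[derive(Debug, PartialEq)]
enum Error {
    Mailbox(MailboxError),
    Service(ServiceError),
}

impl From<MailboxError> for Error {
    fn from(e: MailboxError) -> Self {
        Error::Mailbox(e)
    }
}

impl From<ServiceError> for Error {
    fn from(e: ServiceError) -> Self {
        Error::Service(e)
    }
}

/// Imitates an actor call: the outer `Result` is delivery, the inner is the handler.
fn send(deliver: bool, succeed: bool) -> Result<Result<u32, ServiceError>, MailboxError> {
    if !deliver {
        return Err(MailboxError);
    }
    if succeed {
        Ok(Ok(42))
    } else {
        Ok(Err(ServiceError("db failure".to_string())))
    }
}

/// The first `?` converts `MailboxError`, the second converts `ServiceError`.
fn call_actor(deliver: bool, succeed: bool) -> Result<u32, Error> {
    let value = send(deliver, succeed)??;
    Ok(value)
}
```

With the From implementations in place, the explicit map_err(Error::Mailbox) and map_err(Error::Service) calls become the implicit conversions that ? (or from_err() on futures) performs.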
