Can I borrow values into a closure instead of moving them? - rust

I'm writing a GET method for a server application written in actix-web. I use LMDB as the database, and its transactions need to be aborted or committed before the end of their lifetime.
In order to avoid a bunch of nested matching, I tried using map_err on all of the functions that return a result. There I try to abort the transaction, but the transaction gets moved into the closure instead of being borrowed.
Is there any way to instead borrow the transaction into the closure, or do I have to bite the bullet and write a bunch of nested matches? Essentially, what's the most ergonomic way to write this function?
Example code (see comments next to txn.abort()):
pub async fn get_user(db: Data<Database>, id: Identity) -> Result<Json<User>, Error> {
    let username = id.identity().ok_or_else(|| error::ErrorUnauthorized(""))?;
    let txn = db
        .env
        .begin_ro_txn()
        .map_err(|_| error::ErrorInternalServerError(""))?;
    let user_bytes = txn.get(db.handle_users, &username).map_err(|e| {
        txn.abort(); // txn gets moved here
        match e {
            lmdb::Error::NotFound => {
                id.forget();
                error::ErrorUnauthorized("")
            }
            _ => error::ErrorInternalServerError(""),
        }
    })?;
    let user: User = serde_cbor::from_slice(user_bytes).map_err(|_| {
        txn.abort(); // cannot use txn here as it was moved
        error::ErrorInternalServerError("")
    })?;
    txn.abort(); // cannot use txn here as it was moved
    Ok(Json(user))
}

Sadly, in my case it is impossible to borrow the value into the closure, because abort consumes the transaction. (Thanks to #vkurchatkin for explaining that)
In case anyone is interested, I've worked out a solution that satisfies me regardless of the issue. It was possible for me to avoid nesting a bunch of matches.
I moved all of the logic that works on the transaction into a separate function and then delayed the evaluation of the function Result until after running txn.abort() (see comments):
pub async fn get_user(db: Data<Database>, id: Identity) -> Result<Json<User>, Error> {
    let username = id.identity().ok_or_else(|| error::ErrorUnauthorized(""))?;
    let txn = db
        .env
        .begin_ro_txn()
        .map_err(|_| error::ErrorInternalServerError(""))?;
    // Execute the separate function but do not evaluate its Result yet; notice the missing question mark operator!
    let user = db_get_user(&db, &txn, &id, &username);
    // Abort the transaction after running the code, whether or not it was successful.
    // This consumes the transaction, so it cannot be used anymore.
    txn.abort();
    // Now evaluate the Result using the question mark operator.
    Ok(Json(user?))
}

// New separate function that uses the transaction.
fn db_get_user(
    db: &Database,
    txn: &RoTransaction,
    id: &Identity,
    username: &str,
) -> Result<User, Error> {
    let user_bytes = txn.get(db.handle_users, &username).map_err(|e| match e {
        lmdb::Error::NotFound => {
            id.forget();
            error::ErrorUnauthorized("")
        }
        _ => error::ErrorInternalServerError(""),
    })?;
    serde_cbor::from_slice(user_bytes).map_err(|_| error::ErrorInternalServerError(""))
}
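The same "run the fallible work, consume the resource, then apply `?`" pattern can be sketched with the standard library alone. Everything here is a hypothetical stand-in rather than the lmdb API: `Txn`, its `get` and `abort` methods, and the helper names are assumptions made only for illustration.

```rust
// Hypothetical stand-in for a transaction whose `abort` takes `self` by value.
struct Txn {
    data: Vec<(String, u32)>,
}

impl Txn {
    // Borrowing lookup, comparable in spirit to a read inside a transaction.
    fn get(&self, key: &str) -> Result<u32, String> {
        self.data
            .iter()
            .find(|(k, _)| k.as_str() == key)
            .map(|(_, v)| *v)
            .ok_or_else(|| format!("key not found: {}", key))
    }

    // Consumes the transaction, so a closure that calls this must take ownership.
    fn abort(self) {}
}

// All fallible work only borrows the transaction and may return early with `?`.
fn lookup_doubled(txn: &Txn, key: &str) -> Result<u32, String> {
    let value = txn.get(key)?;
    value.checked_mul(2).ok_or_else(|| "overflow".to_string())
}

fn get_doubled(key: &str) -> Result<u32, String> {
    let txn = Txn {
        data: vec![("a".to_string(), 21)],
    };
    let result = lookup_doubled(&txn, key); // no `?` yet
    txn.abort(); // runs on both the success and the error path
    result // the caller sees the original Ok or Err
}
```

Calling `get_doubled("a")` yields `Ok(42)`, while an unknown key yields an `Err`; in both cases `abort` has already run before the result leaves the function.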

Related

Extracting the saved local variables of the generator data structure of Future

From the "Rust for Rustaceans" book, I read that "... every await or yield is really a return from the function. After all, there are several local variables in the function, and it’s not clear how they’re restored when we resume later on. This is where the compiler-generated part of generators comes into play. The compiler transparently injects code to persist those variables into and read them from the generator’s associated data structure, rather than the stack, at the time of execution. So if you declare, write to, or read from some local variable a, you are really operating on something akin to self.a"
Say I have something like this:
use futures::future::{AbortHandle, Abortable};
use tokio::time::sleep;
use std::time::Duration;

async fn echo(s: String, times_to_repeat: u32) {
    let mut vec = Vec::new();
    for n in 0..times_to_repeat {
        println!("Iteration {} Echoing {}", n, s.clone());
        vec.push(s.clone());
        sleep(Duration::from_millis(10)).await;
    }
}

async fn child(s: String) {
    echo(s, 100).await
}

#[tokio::main]
async fn main() {
    let (abort_handle, abort_registration) = AbortHandle::new_pair();
    let result_fut = Abortable::new(child(String::from("Hello")), abort_registration);
    tokio::spawn(async move {
        sleep(Duration::from_millis(100)).await;
        abort_handle.abort();
    });
    result_fut.await.unwrap();
}
After abort has been called, how do I save/serialize variables like n and vec? Is there a way to reach inside the data structure that the compiler generates for the Future?
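As far as I know, stable Rust gives no way to name or inspect the fields of the compiler-generated state behind an `async fn`. A rough sketch of what the quoted passage describes is a hand-written future in which the "locals" `n` and `vec` are ordinary struct fields, so they remain readable after polling stops. The `Echo` struct, `NoopWaker`, and `poll_at_most` are assumptions for illustration, and the real `sleep(...).await` is replaced by simply returning `Poll::Pending`.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hand-written stand-in for the state of `echo`: its locals live in fields.
struct Echo {
    s: String,
    times_to_repeat: u32,
    n: u32,
    vec: Vec<String>,
}

impl Future for Echo {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.n >= self.times_to_repeat {
            return Poll::Ready(());
        }
        let item = self.s.clone();
        self.vec.push(item);
        self.n += 1;
        // Stands in for the `.await` point: request another poll, then yield.
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

// A waker that does nothing; enough to drive the future by hand.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Polls at most `max_polls` times, then stops (a stand-in for an abort).
// Returns true if the future completed.
fn poll_at_most(fut: &mut Echo, max_polls: usize) -> bool {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    for _ in 0..max_polls {
        if Pin::new(&mut *fut).poll(&mut cx).is_ready() {
            return true;
        }
    }
    false
}
```

After `poll_at_most(&mut fut, 3)` on a future with `times_to_repeat: 100`, the caller can still read `fut.n` and `fut.vec`, which is exactly what the opaque compiler-generated type does not allow. Another option, not shown here, would be to keep such state in a shared structure passed into the async function.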

cannot return value referencing local data

I'm new to Rust. The get_x509 function below produces the compiler error "cannot return value referencing local data pem.contents". I think I understand why (the return value references pem.contents, which is only in scope for that function), but I haven't been able to work out how to get it to work.
The x509 functions in the code below come from the x509_parser crate
use x509_parser::prelude::*;

fn main() {
    let cert = "";
    get_x509(cert);
}

fn get_x509(cert: &str) -> X509Certificate {
    let res_pem = parse_x509_pem(cert.as_bytes());
    let x509_cert = match res_pem {
        Ok((_, pem)) => {
            let res_cert = parse_x509_certificate(&pem.contents);
            match res_cert {
                Ok((_, certificate)) => certificate,
                Err(_err) => {
                    panic!("Parse failed")
                }
            }
        }
        Err(_err) => {
            panic!("Parse failed")
        }
    };
    return x509_cert;
}
I've tried making the cert variable a static value. If I inline the above code in the main() function, it works (but I have to match on &res_pem instead of res_pem).
According to x509-parser-0.14.0/src/certificate.rs, both the parameter and the return value of parse_x509_certificate share a lifetime 'a. One way to solve the problem is to split get_x509 into two functions, so that the second one, which calls parse_x509_certificate, borrows the Pem from its caller instead of referencing a local variable.
The following code compiles (but will panic at runtime since cert is empty):
fn main() {
    let cert = "";
    let pem = get_x509_pem(cert);
    get_x509(&pem); // Return value is unused.
}

use x509_parser::prelude::*;

fn get_x509_pem(cert: &str) -> Pem {
    let (_, pem) = parse_x509_pem(cert.as_bytes()).expect("Parse failed");
    pem
}

fn get_x509(pem: &Pem) -> X509Certificate {
    let x509_cert = match parse_x509_certificate(&pem.contents) {
        Ok((_, certificate)) => certificate,
        Err(_err) => {
            panic!("Parse failed")
        }
    };
    x509_cert
}
The issue here, just as you've said, is that you have something that only lives in the context of the function, and you want to return a reference to it (or to some parts of it). But when the function execution is finished, the underlying data is removed, hence you would return a dangling reference - and Rust prevents this.
The way around this (well illustrated by Simon Smith's answer) is to return the data you'd like to reference instead of returning just the reference. So in your case, you want to return the whole res_pem object and then do any further data extraction.
Reading the library's documentation, you seem to be in the unlucky situation where there is no way around moving res_pem out of the function into a static space, since parse_x509_pem returns owned data and X509Certificate contains references into it. The data the returned certificate references has to outlive the function, but res_pem is owned by the function and is dropped when the function returns.
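The ownership situation can be reproduced without the x509_parser crate. In this minimal sketch, `Pem`, `Cert`, `parse_pem`, and `parse_cert` are hypothetical stand-ins (not the crate's types): one type owns the bytes and the other borrows from them, much as the answers describe for Pem and X509Certificate.

```rust
// Hypothetical stand-ins: `Pem` owns its bytes, `Cert<'a>` borrows from them.
struct Pem {
    contents: Vec<u8>,
}

struct Cert<'a> {
    subject: &'a [u8],
}

// Returns owned data, so it can freely leave the function.
fn parse_pem(input: &str) -> Pem {
    Pem {
        contents: input.as_bytes().to_vec(),
    }
}

// The returned Cert borrows from `pem`, which the CALLER owns,
// so the reference stays valid after this function returns.
fn parse_cert<'a>(pem: &'a Pem) -> Cert<'a> {
    Cert {
        subject: &pem.contents[..],
    }
}

// Does NOT compile (error E0515): `pem` is dropped when the function
// returns, so the Cert would point at freed memory.
//
// fn parse_both(input: &str) -> Cert<'_> {
//     let pem = parse_pem(input);
//     parse_cert(&pem)
// }
```

The caller keeps the owner alive for as long as it needs the borrowed view: `let pem = parse_pem("abc"); let cert = parse_cert(&pem);`.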

Waiting on multiple futures borrowing mutable self

Each of the following methods needs &mut self to operate. The following code gives this error:
cannot borrow *self as mutable more than once at a time
How can I achieve this correctly?
loop {
    let future1 = self.handle_new_connections(sender_to_connector.clone());
    let future2 = self.handle_incoming_message(&mut receiver_from_peers);
    let future3 = self.handle_outgoing_message();
    tokio::pin!(future1, future2, future3);
    tokio::select! {
        _ = future1 => {},
        _ = future2 => {},
        _ = future3 => {}
    }
}
You are not allowed to have multiple mutable references to an object and there's a good reason for that.
Imagine you pass an object mutably to two different functions and they modify it out of sync, with no synchronization mechanism in place. You would then end up with something called a race condition.
To prevent this bug, Rust allows only one mutable reference to an object at a time, but you can have multiple immutable references, and you will often see people use interior mutability patterns.
In your case, you don't want the data to be modified by two different threads at the same time, so you'd wrap it in a Mutex or RwLock. Then, since you want multiple threads to be able to own this value, you'd wrap that in an Arc.
Here you can read about interior mutability in more detail.
Alternatively, when declaring your function's type, you could give the resulting Future an explicit lifetime to indicate that it will be awaited in the same context. Since your code waits for the future before the next iteration, that would do the trick as well.
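A minimal sketch of the Arc plus Mutex combination, using only the standard library and plain threads rather than tokio tasks; the function name `count_in_parallel` and the counter are assumptions chosen for illustration.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Several threads share ownership of one value (Arc) and take turns
// mutating it (Mutex), so no two of them ever hold `&mut` at once.
fn count_in_parallel(workers: usize, increments: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            // Each thread gets its own handle to the same allocation.
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..increments {
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}
```

The lock guard returned by `lock()` is what grants temporary mutable access; it is released at the end of each statement here, which keeps the critical section short.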
I encountered the same problem when dealing with async code. Here is what I figured out:
Let's say you have an Engine, that contains both incoming and outgoing:
struct Engine {
    log: Arc<Mutex<Vec<String>>>,
    outgoing: UnboundedSender<String>,
    incoming: UnboundedReceiver<String>,
}
Our goal is to create two functions, process_incoming and process_logic, and then poll them simultaneously without fighting Rust's borrow checker.
What is important here is that:
You cannot pass &mut self to these async functions simultaneously.
Each of incoming and outgoing is held by at most one function.
Data accessed by both process_incoming and process_logic needs to be wrapped in a lock.
Any attempt to lock Engine directly will lead to a deadlock at runtime.
So we give up on methods in favor of associated functions:
impl Engine {
    // ...
    async fn process_logic(outgoing: &mut UnboundedSender<String>, log: Arc<Mutex<Vec<String>>>) {
        loop {
            Delay::new(Duration::from_millis(1000)).await.unwrap();
            let msg: String = "ping".into();
            println!("outgoing: {}", msg);
            log.lock().push(msg.clone());
            outgoing.send(msg).await.unwrap();
        }
    }

    async fn process_incoming(
        incoming: &mut UnboundedReceiver<String>,
        log: Arc<Mutex<Vec<String>>>,
    ) {
        while let Some(msg) = incoming.next().await {
            println!("incoming: {}", msg);
            log.lock().push(msg);
        }
    }
}
And we can then write main as:
fn main() {
    futures::executor::block_on(async {
        let mut engine = Engine::new();
        let a = Engine::process_incoming(&mut engine.incoming, engine.log.clone()).fuse();
        let b = Engine::process_logic(&mut engine.outgoing, engine.log).fuse();
        futures::pin_mut!(a, b);
        select! {
            _ = a => {},
            _ = b => {},
        }
    });
}
I put the whole example here.
It's a workable solution; just be aware that you need to add futures and futures-timer to your dependencies.
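The reason associated functions help is that the borrow checker tracks struct fields separately: two mutable borrows of different fields can coexist, whereas two `&mut self` borrows cannot. A synchronous sketch of that idea follows; this simplified `Engine` with `Vec` fields, along with `drain_incoming`, `queue_ping`, and `step`, is an assumption for illustration and not the channel-based code above.

```rust
// Simplified stand-in: plain Vecs replace the channels and the shared log.
struct Engine {
    incoming: Vec<String>,
    outgoing: Vec<String>,
    log: Vec<String>,
}

impl Engine {
    // Associated functions: each one borrows only the field it needs.
    fn drain_incoming(incoming: &mut Vec<String>) -> Vec<String> {
        incoming.drain(..).collect()
    }

    fn queue_ping(outgoing: &mut Vec<String>) {
        outgoing.push("ping".to_string());
    }
}

fn step(engine: &mut Engine) {
    // Both mutable borrows are alive at the same time. This is accepted
    // because they point at disjoint fields of `engine`.
    let incoming = &mut engine.incoming;
    let outgoing = &mut engine.outgoing;

    Engine::queue_ping(outgoing);
    let received = Engine::drain_incoming(incoming);

    // A third field is still freely accessible.
    engine.log.extend(received);
}
```

Had `drain_incoming` and `queue_ping` taken `&mut self`, holding both borrows at once would be rejected with "cannot borrow as mutable more than once at a time".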

Run async function in run_interval and return result

I need to run an async function in actix::prelude::AsyncContext::run_interval, but I need to also pass in a struct member and return the result (not the future). This is a somewhat more complex version of this question here. As can be seen in the commented section below, I have tried a few approaches but all of them fail for one reason or another.
I have looked at a few related resources, including the AsyncContext trait and these StackOverflow questions: 3, 4.
Here is my example code (actix crate is required in Cargo.toml):
use std::time::Duration;
use actix::{Actor, Arbiter, AsyncContext, Context, System};

struct MyActor {
    id: i32
}

impl MyActor {
    fn new(id: i32) -> Self {
        Self {
            id: id,
        }
    }

    fn heartbeat(&self, ctx: &mut <Self as Actor>::Context) {
        ctx.run_interval(Duration::from_secs(1), |act, ctx| {
            //lifetime issue
            //let res = 0;
            //Arbiter::spawn(async {
            //    res = two(act.id).await;
            //});

            //future must return `()`
            //let res = Arbiter::spawn(two(act.id));

            //async closures unstable
            //let res = Arbiter::current().exec(async || {
            //    two(act.id).await
            //});
        });
    }
}

impl Actor for MyActor {
    type Context = Context<Self>;

    fn started(&mut self, ctx: &mut Self::Context) {
        self.heartbeat(ctx);
    }
}

// assume functions `one` and `two` live in another module
async fn one(id: i32) -> i32 {
    // assume something is done with id here
    let x = id;
    1
}

async fn two(id: i32) -> i32 {
    let x = id;
    // assume this may call other async functions
    one(x).await;
    2
}

fn main() {
    let mut system = System::new("test");
    system.block_on(async { MyActor::new(10).start() });
    system.run();
}
Rust version:
$ rustc --version
rustc 1.50.0 (cb75ad5db 2021-02-10)
Using Arbiter::spawn would work, but the issue is with the data being accessed from inside the async block that's passed to Arbiter::spawn. Since you're accessing act from inside the async block, that reference will have to live longer than the closure that calls Arbiter::spawn. In fact, it will have to have a lifetime of 'static, since the future produced by the async block could potentially live until the end of the program.
One way to get around this in this specific case, given that you need an i32 inside the async block, and an i32 is a Copy type, is to move it:
ctx.run_interval(Duration::from_secs(1), |act, ctx| {
    let id = act.id;
    Arbiter::spawn(async move {
        two(id).await;
    });
});
Since we're using async move, the id variable will be moved into the future, and will thus be available whenever the future is run. By assigning it to id first, we are actually copying the data, and it's the copy (id) that will be moved.
But this might not be what you want if you're looking for a more general solution where you can access the object inside the async function. In that case, it gets a bit trickier, and you might want to consider not using async functions if possible. If you must, it might be possible to have a separate object holding the data you need, wrapped in std::rc::Rc, which can then be moved into the async block without duplicating the underlying data.
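The Rc idea can be sketched without actix. Here `run_static` is a hypothetical stand-in for a spawner that demands a 'static task, and `Shared`, this simplified `MyActor`, and `heartbeat` are likewise assumptions made for illustration. A spawner that additionally requires `Send` would need `Arc` instead of `Rc`.

```rust
use std::rc::Rc;

// Hypothetical actor state that is not `Copy`, so it cannot simply be
// copied into the task the way the i32 id can.
struct Shared {
    id: i32,
    name: String,
}

struct MyActor {
    shared: Rc<Shared>,
}

// Stand-in for a spawner that requires a 'static task: the task may not
// hold any borrow of the actor.
fn run_static<F: FnOnce() -> String + 'static>(task: F) -> String {
    task()
}

fn heartbeat(act: &MyActor) -> String {
    // Cloning the Rc only bumps a reference count; `Shared` is not duplicated.
    let shared = Rc::clone(&act.shared);
    // `move` transfers the cloned handle into the task, so the task owns
    // everything it touches and satisfies the 'static bound.
    run_static(move || format!("{}:{}", shared.id, shared.name))
}
```

Capturing `&act.shared` directly instead of a cloned `Rc` would fail the 'static bound, for the same reason the reference to `act` does in the question.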

Why is Option<T>::and_then() not mutually exclusive from a following .unwrap_or()?

Why doesn't Option::and_then() run mutually exclusively with a following Option::unwrap_or()? Shouldn't and_then() run only if the Option is Some, and unwrap_or() only if it is None? Here's a code example: the first approach triggers a complaint from the borrow checker and the latter does not, but in theory shouldn't they do the same thing?
use std::collections::HashMap;

#[derive(Debug)]
struct Response {
    account: String,
    status: String,
    error: String,
}

fn main() {
    let num = String::from("426238");
    let record = {
        Response {
            account: "".into(),
            status: "failed".into(),
            error: "Invalid Account".into(),
        }
    };
    let mut acct_map = HashMap::new();
    acct_map.insert(&num, record);

    // Doesn't work
    let record = acct_map.remove(&num)
        .and_then(|mut r| {r.account = num; Some(r)}) // Should only get processed if there's a Some()
        .unwrap_or( // Should only get processed if there's a None
            Response {
                account: num,
                status: String::from("failed"),
                error: String::from("The server did not return a response for this account."),
            }
        ); // Yet rustc says variable moved from .and_then()

    // Works
    let num = String::from("426238");
    let record = if let Some(mut response) = acct_map.remove(&num) {
        response.account = num;
        response
    } else {
        Response {
            account: num,
            status: String::from("failed"),
            error: String::from("The server did not return a response for this account."),
        }
    };
}
After getting that complaint with the former, I switched to the latter, since it is more understandable and actually works, but I'm wondering whether there is more to .and_then() and .unwrap_or() than I understand.
First off, since you used unwrap_or rather than unwrap_or_else, the argument to unwrap_or is always evaluated, which means num is always moved out.
Second, even if you had used unwrap_or_else, nothing in the signatures of and_then or unwrap_or_else tells the borrow checker that these methods are mutually exclusive, so as far as it is concerned both closures could execute. This isn't allowed.
if let is the way to go here.
The closure used in your and_then() captures num by value. Although the execution of the .unwrap_or() is mutually exclusive with assigning num to r.account, the variable is still moved into the closure's scope.
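Both points can be checked with a small standard-library sketch. The names `eager_vs_lazy` and `resolve`, and this two-field `Response`, are assumptions for illustration. The first function counts how often the fallback expression is evaluated when the Option is Some; the second uses a match, where the compiler can see that only one arm runs and therefore accepts moving `num` in both.

```rust
use std::cell::Cell;

// Returns (times the unwrap_or fallback ran, times the unwrap_or_else fallback ran)
// when the Option is Some in both cases.
fn eager_vs_lazy() -> (u32, u32) {
    let eager_calls = Cell::new(0);
    let lazy_calls = Cell::new(0);
    // Records that it was called, then produces a fallback value.
    let make = |calls: &Cell<u32>| {
        calls.set(calls.get() + 1);
        0
    };

    // The argument is an ordinary expression: evaluated before unwrap_or is called.
    let _ = Some(1).unwrap_or(make(&eager_calls));
    // The closure is only invoked when the Option is None.
    let _ = Some(1).unwrap_or_else(|| make(&lazy_calls));

    (eager_calls.get(), lazy_calls.get())
}

struct Response {
    account: String,
    status: String,
}

// `num` is moved in both arms, which is fine because the arms of a single
// match are known to be mutually exclusive.
fn resolve(found: Option<Response>, num: String) -> Response {
    match found {
        Some(mut r) => {
            r.account = num;
            r
        }
        None => Response {
            account: num,
            status: "failed".to_string(),
        },
    }
}
```

`eager_vs_lazy()` returns `(1, 0)`: the `unwrap_or` fallback was built even though it was never needed, while the `unwrap_or_else` closure never ran.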
