cannot return value referencing local data - rust

I'm new to Rust. The get_x509 function below produces the compiler error "cannot return value referencing local data pem.contents". I think I understand why (the return value references pem.contents, which only lives for the duration of that function), but I haven't been able to work out how to fix it.
The x509 functions in the code below come from the x509_parser crate:
use x509_parser::prelude::*;
fn main() {
let cert = "";
get_x509(cert);
}
fn get_x509(cert: &str) -> X509Certificate {
let res_pem = parse_x509_pem(cert.as_bytes());
let x509_cert = match res_pem {
Ok((_, pem)) => {
let res_cert = parse_x509_certificate(&pem.contents);
match res_cert {
Ok((_, certificate)) => certificate,
Err(_err) => {
panic!("Parse failed")
}
}
}
Err(_err) => {
panic!("Parse failed")
}
};
return x509_cert;
}
I've tried making the cert variable a static value. If I inline the above code in the main() function, it works (but I have to match on &res_pem instead of res_pem).

According to x509-parser-0.14.0/src/certificate.rs, both the parameter and the return value of parse_x509_certificate have a lifetime 'a associated with them. One way to solve the problem is to split get_x509 into two functions, so that the second function, which calls parse_x509_certificate, borrows its input from the caller instead of referencing a local value.
The following code compiles (but will panic at runtime since cert is empty):
fn main() {
let cert = "";
let pem = get_x509_pem(cert);
get_x509(&pem); // Return value is unused.
}
use x509_parser::prelude::*;
fn get_x509_pem(cert: &str) -> Pem {
let (_, pem) = parse_x509_pem(cert.as_bytes()).expect("Parse failed");
pem
}
fn get_x509(pem: &Pem) -> X509Certificate {
let x509_cert = match parse_x509_certificate(&pem.contents) {
Ok((_, certificate)) => certificate,
Err(_err) => {
panic!("Parse failed")
}
};
x509_cert
}

The issue here, just as you've said, is that you have something that only lives in the context of the function, and you want to return a reference to it (or to some parts of it). But when the function execution is finished, the underlying data is removed, hence you would return a dangling reference - and Rust prevents this.
The way around this (well illustrated by the answer of Simon Smith) is to return the data you'd like to reference instead of just returning the reference. So in your case, you want to return the whole res_pem object (or the Pem inside it) and then do any further data extraction in the caller.
Reading the documentation of the library, you seem to be in an unlucky situation where there is no way around moving res_pem out of the function (into the caller's scope or into static storage), since parse_x509_pem returns owned data and X509Certificate contains references into it. The returned certificate has to outlive the function, but the object it references (res_pem) is owned by the function and is dropped when the function returns.
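The same ownership split can be shown without the x509_parser crate. The following is only a minimal std-only sketch: Doc, Title, load_doc and parse_title are hypothetical stand-ins for Pem, X509Certificate, parse_x509_pem and parse_x509_certificate, not the crate's API.

```rust
/// Owned data, playing the role of `Pem` (hypothetical stand-in).
struct Doc {
    contents: Vec<u8>,
}

/// A view that borrows from a `Doc`, playing the role of `X509Certificate<'a>`.
#[derive(Debug, PartialEq)]
struct Title<'a> {
    text: &'a str,
}

/// Step 1: produce the owned value and hand it to the caller.
fn load_doc(input: &str) -> Doc {
    Doc {
        contents: input.as_bytes().to_vec(),
    }
}

/// Step 2: borrow from data the caller owns, so the returned view
/// cannot outlive its source.
fn parse_title(doc: &Doc) -> Option<Title<'_>> {
    let text = std::str::from_utf8(&doc.contents).ok()?;
    let first_line = text.lines().next()?;
    Some(Title { text: first_line })
}

// Doing both steps in one function is rejected with E0515, because `doc`
// is dropped at the end of the function while the returned `Title` would
// still point into it:
//
// fn broken(input: &str) -> Option<Title<'_>> {
//     let doc = load_doc(input);
//     parse_title(&doc)
// }
```

A caller would keep the Doc alive in its own scope (let doc = load_doc(...);) and call parse_title(&doc) for as long as it needs the borrowed view.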

Related

Keep value used in closure for later use

fn callback_test() {
let image = HtmlImageElement::new().unwrap();
let callback: Closure<dyn Fn()> = {
Closure::wrap(Box::new(|| {
image.set_src("foo");
}))
};
image.set_onload(Some(&callback.as_ref().unchecked_ref()));
}
Here is an example of what I'm trying to achieve. If I don't use the move keyword before the closure declaration I get a lifetime error, and if I use it I can't assign my callback later in the function. What is the correct way to resolve this issue?
You will have to .clone() it.
The Closure only works with functions that are 'static, meaning they can't hold references to local variables. If it were to allow that, then calling that closure after callback_test() completes would try to use a dangling reference since image has already been dropped.
So move-ing it into the closure is the right move. And since you have to use it again after creating the closure, you will need two copies to work with.
Try this out:
fn callback_test() {
let image = HtmlImageElement::new().unwrap();
let callback: Closure<dyn Fn()> = {
let image = image.clone();
Closure::wrap(Box::new(move || {
image.set_src("foo");
}))
};
image.set_onload(Some(&callback.as_ref().unchecked_ref()));
}
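The clone-then-move pattern is not specific to web-sys. Below is a small std-only sketch of the same idea, assuming a shared handle type: Rc<RefCell<String>> is a hypothetical stand-in for the image element, and make_callback is an illustrative name.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Builds a `'static` callback that writes to shared state.
/// `Rc<RefCell<String>>` is a hypothetical stand-in for the image handle.
fn make_callback(src: &Rc<RefCell<String>>) -> Box<dyn Fn() + 'static> {
    // Clone the handle first; the clone is what gets moved into the closure,
    // so the caller keeps its own handle for later use.
    let src = Rc::clone(src);
    Box::new(move || {
        *src.borrow_mut() = String::from("foo");
    })
}
```

The closure owns its own handle, so it satisfies the 'static bound, while the original handle stays usable after the closure has been created.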

How to avoid move of possibly-uninitialized variable for MutexGuard interface?

For the following code:
let conversation_model =
if lsm { CONVMODEL.lock().await } else {
conv_model_loader()
};
CONVMODEL.lock().await is a MutexGuard<T> and conv_model_loader() is just a T.
I need a common interface for those two so that I don't have to copy-paste my code for the two situations; the code only differs in this type, and everything else is the same.
Edit:
Here is the code (at least what I was trying to do):
let (locked, loaded); // pun not intended
if lsm {
locked = CONVMODEL.lock().await;
} else {
loaded = conv_model_loader();
};
let mut chat_context = CHAT_CONTEXT.lock().await;
task::spawn_blocking(move || {
let conversation_model = if lsm { &*locked } else { &loaded };
but it failed because of
use of possibly-uninitialized variable: `locked`
use of possibly-uninitialized `locked`
So the question is really how to get a &T interface from a MutexGuard while using it inside spawn_blocking, and also with #[async_recursion].
Edit:
let (mut locked, mut loaded) = (None, None);
if lsm {
locked = Some( CONVMODEL.lock().await );
} else {
loaded = Some( conv_model_loader() );
};
let mut chat_context = CHAT_CONTEXT.lock().await;
task::spawn_blocking(move || {
let (lock, load);
let conversation_model =
if lsm {
lock = locked.unwrap();
&*lock
} else {
load = loaded.unwrap();
&load
};
The code above works but is actually very ugly.
(I wonder if it is possible to simplify it.)
Whenever you have some set of choices for a value, you want to reach for enum. For example, in Rust we don't do things like let value: T; let is_initialized: bool;, we do Option<T>.
You have a choice of two values, either an acquired mutex or a direct value. This is typically called "either", and there is a popular Rust crate containing this type: Either. For you it might look like:
use either::Either;
let conv_model = if lsm {
Either::Left(CONVMODEL.lock().await)
} else {
Either::Right(conv_model_loader())
};
tokio::task::spawn_blocking(move || {
let conversation_model = match &conv_model {
Either::Left(locked) => locked.deref(),
Either::Right(loaded) => loaded,
};
conversation_model.infer();
});
This type used to live in the standard library, but was removed because it wasn't often used as it's fairly trivial to make a more descriptive domain-specific type. I agree with that, and you might do:
pub enum ConvModelSource {
Locked(MutexGuard<'static, ConvModel>),
Loaded(ConvModel),
}
impl Deref for ConvModelSource {
type Target = ConvModel;
fn deref(&self) -> &Self::Target {
match self {
Self::Locked(guard) => guard.deref(),
Self::Loaded(model) => model,
}
}
}
// ...
let conv_model = if lsm {
ConvModelSource::Locked(CONVMODEL.lock().await)
} else {
ConvModelSource::Loaded(conv_model_loader())
};
tokio::task::spawn_blocking(move || {
conv_model.infer();
});
This is much more expressive, and moves the "how to populate this" away from where it's used.
In the common case you do want to use the simpler approach user4815162342 showed. You will store one of the temporaries, form a reference to it (knowing you just initialized it), and hand that back.
This doesn't work with spawn_blocking, however. The lifetime of the reference is that of the temporaries - handing such a reference off to a spawned task is a dangling reference.
This is why the error messages (of the form "borrowed value does not live long enough" and "argument requires that locked is borrowed for 'static") guided you to go down the path of trying to move locked and loaded into the closure to be in their final resting place, then form a reference. Then the reference wouldn't be dangling.
But then this implies you move a possibly-uninitialized value into the closure. Rust does not understand you are using an identical check to see which temporary value is populated. (You could imagine a typo on the second check doing !lsm and now you're switched up.)
Ultimately, you have to move the source of the value into the spawned task (closure) so that you form references with usable lifetimes. The use of enum is basically codifying your boolean case check into something Rust understands and will unpack naturally.
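To make the enum approach concrete without tokio, here is a std-only sketch. Model, SHARED_MODEL, ModelSource, run_owned and model_name are all hypothetical names, and run_owned merely imitates the 'static bound that spawn_blocking places on its closure; it does not spawn anything.

```rust
use std::ops::Deref;
use std::sync::{Mutex, MutexGuard};

/// Hypothetical model type standing in for `ConvModel`.
struct Model {
    name: &'static str,
}

static SHARED_MODEL: Mutex<Model> = Mutex::new(Model { name: "shared" });

/// Either a guard on the shared model or a freshly loaded one.
enum ModelSource {
    Locked(MutexGuard<'static, Model>),
    Loaded(Model),
}

impl Deref for ModelSource {
    type Target = Model;

    fn deref(&self) -> &Model {
        match self {
            // `guard` is `&MutexGuard<..>`; go through the guard to the model.
            ModelSource::Locked(guard) => &**guard,
            ModelSource::Loaded(model) => model,
        }
    }
}

/// Accepts only closures that own everything they use (`'static`),
/// which is the same requirement a spawned task has.
fn run_owned<F>(f: F) -> String
where
    F: FnOnce() -> String + 'static,
{
    f()
}

fn model_name(use_shared: bool) -> String {
    let source = if use_shared {
        ModelSource::Locked(SHARED_MODEL.lock().unwrap())
    } else {
        ModelSource::Loaded(Model { name: "loaded" })
    };
    // The enum is moved into the closure; the reference is formed inside it.
    run_owned(move || source.name.to_string())
}
```

One caveat of this stand-in: std::sync::MutexGuard is not Send, so this exact enum could not be passed to std::thread::spawn. The original answer relies on the async mutex guard being sendable to the blocking task.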
You can extract &mut T from both and use that. Something like the following should work:
let (mut locked, mut loaded); // pun not intended; both must be `mut` to take `&mut` below
let conversation_model = if lsm {
locked = CONVMODEL.lock().await;
&mut *locked
} else {
loaded = conv_model_loader();
&mut loaded
};
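A runnable std-only version of this deferred-initialization pattern might look like the sketch below. SHARED_ITEMS and push_one are hypothetical names, and the reference is used inside the same function rather than handed to a spawned task.

```rust
use std::sync::Mutex;

static SHARED_ITEMS: Mutex<Vec<i32>> = Mutex::new(Vec::new());

/// Pushes one element into either the shared vector or a fresh local one
/// and returns the resulting length.
fn push_one(use_shared: bool) -> usize {
    // Declared up front so that whichever one gets initialized
    // outlives the reference stored in `target`.
    let (mut locked, mut loaded);
    let target: &mut Vec<i32> = if use_shared {
        locked = SHARED_ITEMS.lock().unwrap();
        &mut *locked
    } else {
        loaded = vec![10, 20];
        &mut loaded
    };
    target.push(1);
    target.len()
}
```

Only one of the two variables is ever initialized on a given call, and the compiler tracks that per branch, so no Option wrapper is needed here.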

Can I borrow values into a closure instead of moving them?

I'm writing a GET method for a server application written in actix-web. LMDB is the database I use; its transactions need to be aborted or committed before the end of their lifetime.
In order to avoid a bunch of nested matching, I tried using map_err on all of the functions that return a result. There I try to abort the transaction, but the transaction gets moved into the closure instead of being borrowed.
Is there any way to instead borrow the transaction into the closure, or do I have to bite the bullet and write a bunch of nested matches? Essentially, what's the most ergonomic way to write this function?
Example code (see comments next to txn.abort()):
pub async fn get_user(db: Data<Database>, id: Identity) -> Result<Json<User>, Error> {
let username = id.identity().ok_or_else(|| error::ErrorUnauthorized(""))?;
let txn = db
.env
.begin_ro_txn()
.map_err(|_| error::ErrorInternalServerError(""))?;
let user_bytes = txn.get(db.handle_users, &username).map_err(|e| {
txn.abort(); // txn gets moved here
match e {
lmdb::Error::NotFound => {
id.forget();
error::ErrorUnauthorized("")
}
_ => error::ErrorInternalServerError(""),
}
})?;
let user: User = serde_cbor::from_slice(user_bytes).map_err(|_| {
txn.abort(); // cannot use txn here as it was moved
error::ErrorInternalServerError("")
})?;
txn.abort(); // cannot use txn here as it was moved
Ok(Json(user))
}
Sadly, in my case it is impossible to borrow the value into the closure, because abort consumes the transaction. (Thanks to @vkurchatkin for explaining that.)
In case anyone is interested, I've worked out a solution that satisfies me regardless of the issue. It was possible for me to avoid nesting a bunch of matches.
I moved all of the logic that works on the transaction into a separate function and then delayed the evaluation of the function Result until after running txn.abort() (see comments):
pub async fn get_user(db: Data<Database>, id: Identity) -> Result<Json<User>, Error> {
let username = id.identity().ok_or_else(|| error::ErrorUnauthorized(""))?;
let txn = db
.env
.begin_ro_txn()
.map_err(|_| error::ErrorInternalServerError(""))?;
let user = db_get_user(&db, &txn, &id, &username); // Execute separate function but do not evaluate the function Result yet, notice missing question mark operator!
txn.abort(); // Abort the transaction after running the code. (Doesn't matter if it was successful or not. This consumes the transaction and it cannot be used anymore.)
Ok(Json(user?)) // Now evaluate the Result using the question mark operator.
}
// New separate function that uses the transaction.
fn db_get_user(
db: &Database,
txn: &RoTransaction,
id: &Identity,
username: &str,
) -> Result<User, Error> {
let user_bytes = txn.get(db.handle_users, &username).map_err(|e| match e {
lmdb::Error::NotFound => {
id.forget();
error::ErrorUnauthorized("")
}
_ => error::ErrorInternalServerError(""),
})?;
serde_cbor::from_slice(user_bytes).map_err(|_| error::ErrorInternalServerError(""))
}
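The "keep the Result, clean up, then propagate" pattern can be reduced to a std-only sketch. Txn, ABORTS, read_age and get_age are hypothetical names chosen for illustration; the only property borrowed from the LMDB transaction is that abort takes self by value.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how many transactions were aborted (only here to observe the sketch).
static ABORTS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical transaction whose `abort` consumes it.
struct Txn;

impl Txn {
    fn get(&self, key: &str) -> Result<&'static str, String> {
        match key {
            "alice" => Ok("42"),
            _ => Err(format!("no such key: {key}")),
        }
    }

    /// Takes `self` by value, so the transaction is unusable afterwards.
    fn abort(self) {
        ABORTS.fetch_add(1, Ordering::SeqCst);
    }
}

/// All fallible work only borrows the transaction and may use `?` freely.
fn read_age(txn: &Txn, key: &str) -> Result<u32, String> {
    let raw = txn.get(key)?;
    raw.parse::<u32>().map_err(|e| e.to_string())
}

fn get_age(key: &str) -> Result<u32, String> {
    let txn = Txn;
    let age = read_age(&txn, key); // no `?` yet: keep the Result as a value
    txn.abort(); // runs on both the success and the failure path
    age // the Result is handed back only after cleanup
}
```

Because the early returns happen inside read_age, none of them can skip the abort call in get_age.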

Re-using values without declaring variables

In Kotlin, I can re-use values like so:
"127.0.0.1:135".let {
connect(it) ?: System.err.println("Failed to connect to $it")
}
Is anything similar possible in Rust? To avoid using a temporary variable like this:
let text_address = "127.0.0.1:135";
TcpListener::bind(text_address).expect(&format!("Failed to connect to {}", text_address)); // expect takes a &str
According to this reference, T.let in Kotlin is a generic method-like function which runs a closure (T) -> R with the given value T passed as the first argument. From this perspective, it resembles a mapping operation from T to R. Under Kotlin's syntax though, it looks like a means of making a scoped variable with additional emphasis.
We could do the exact same thing in Rust, but it doesn't bring anything new to the table, nor does it make the code cleaner (using _let because let is a keyword in Rust):
trait LetMap {
fn _let<F, R>(self, mut f: F) -> R
where
Self: Sized,
F: FnMut(Self) -> R,
{
f(self)
}
}
impl<T> LetMap for T {}
// then...
"something"._let(|it| {
println!("it = {}", it);
"good"
});
When dealing with a single value, it is actually more idiomatic to just declare a variable. If you need to constrain the variable (and/or the value's lifetime) to a particular scope, just place it in a block:
let conn = {
let text_address = "127.0.0.1:135";
TcpListener::bind(text_address)?
};
There is also one more situation worth mentioning: Kotlin has an idiom for nullable values where x?.let is used to conditionally perform something when the value isn't null.
val value = ...
value?.let {
... // execute this block if not null
}
In Rust, an Option already provides a similar feature, either through pattern matching or the many available methods with conditional execution: map, map_or_else, unwrap_or_else, and_then, and more.
let value: Option<_> = get_opt();
// 1: pattern matching
if let Some(non_null_value) = value {
// ...
}
// 2: functional methods
let new_opt_value: Option<_> = value.map(|non_null_value| {
"a new value"
}).and_then(some_function_returning_opt);
This is similar:
{
let text_address = "127.0.0.1:135";
TcpListener::bind(text_address).expect(&format!("Failed to connect to {}", text_address)); // expect takes a &str
}
// now text_address is out of scope

Assign value from match statement

I'm trying to make a Git command in Rust. I'm using the clap argument parser crate to do the command line handling. I want my command to take an optional argument for which directory to do work in. If the command does not receive the option, it assumes the user's home directory.
I know that I can use the std::env::home_dir function to get the user's home directory if it is set but the part that confuses me is how to properly use the match operator to get the value of the path. Here is what I've been trying:
use std::env;
fn main() {
// Do some argument parsing stuff...
let some_dir = if matches.is_present("some_dir") {
matches.value_of("some_dir").unwrap()
} else {
match env::home_dir() {
Some(path) => path.to_str(),
None => panic!("Uh, oh!"),
}
};
// Do more things
I get an error message when I try to compile this saying that path.to_str() doesn't live long enough. I get that the value returned from to_str lives for the length of the match scope but how can you return a value from a match statement that has to call another function?
path.to_str() will return a &str reference to the inner string contained in path, which will only live as long as path, that is inside the match arm.
You can use to_owned to get an owned copy of that &str. You will have to adapt the value from clap accordingly to have the same types in both branches of your if:
let some_dir = if matches.is_present("some_dir") {
matches.value_of("some_dir").unwrap().to_owned()
} else {
match env::home_dir() {
Some(path) => path.to_str().unwrap().to_owned(),
None => panic!("Uh, oh!"),
}
};
Alternatively, you could use Cow to avoid the copy in the first branch:
use std::borrow::Cow;
let some_dir: Cow<str> = if matches.is_present("some_dir") {
matches.value_of("some_dir").unwrap().into()
} else {
match env::home_dir() {
Some(path) => path.to_str().unwrap().to_owned().into(),
None => panic!("Uh, oh!"),
}
};
What is happening is that the scope of the match statement takes ownership of the PathBuf object returned from env::home_dir(). You then attempt to return a reference to that object, but the object ceases to exist immediately.
The solution is to return PathBuf rather than a reference to it (or convert it to a String and return that instead, in any case, it has to be some type that owns the data). You may have to change what matches.value_of("some_dir").unwrap() returns so that both branches return the same type.
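As a rough sketch of the "return an owning type" suggestion, the function below takes both inputs as parameters so that it depends neither on clap nor on the environment. resolve_dir and its signature are hypothetical, not part of either API.

```rust
use std::path::PathBuf;

/// Picks the explicit directory if one was given, otherwise the home directory.
/// Both branches produce an owned `PathBuf`, so nothing borrowed escapes.
fn resolve_dir(arg: Option<&str>, home: Option<PathBuf>) -> PathBuf {
    match arg {
        Some(dir) => PathBuf::from(dir),
        None => home.expect("no home directory available"),
    }
}
```

In the original program this could be called as resolve_dir(matches.value_of("some_dir"), env::home_dir()), assuming value_of returns an Option<&str> as it is used in the question.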
There is a rather simple trick: increase the scope of path (and thus its lifetime) so that you can take a reference into it.
use std::env;
fn main() {
// Do some argument parsing stuff...
let path; // <--
let some_dir = if matches.is_present("some_dir") {
matches.value_of("some_dir").unwrap()
} else {
match env::home_dir() {
Some(p) => { path = p; path.to_str().unwrap() },
None => panic!("Uh, oh!"),
}
};
// Do more things
}
It is efficient, as path is only ever used when necessary, and does not require changing the types in the program.
Note: I added an .unwrap() after .to_str() because .to_str() returns an Option. And do note that the reason it returns an Option<&str> is because not all paths are valid UTF-8 sequences. You might want to stick to Path/PathBuf.

Resources