How to avoid move of possibly-uninitialized variable for MutexGuard interface? - rust

For the following code:
    let conversation_model = if lsm {
        CONVMODEL.lock().await
    } else {
        conv_model_loader()
    };
CONVMODEL.lock().await is a MutexGuard<T> and conv_model_loader() is just a T.
I need a common interface for the two so that I don't have to copy-paste my code for both cases; the copies would differ only in this type, and everything else is the same.
Edit:
Here is the code (at least what I was trying to do):
    let (locked, loaded); // pun not intended
    if lsm {
        locked = CONVMODEL.lock().await;
    } else {
        loaded = conv_model_loader();
    };
    let mut chat_context = CHAT_CONTEXT.lock().await;
    task::spawn_blocking(move || {
        let conversation_model = if lsm { &*locked } else { &loaded };
but it failed because of
    use of possibly-uninitialized variable: `locked`
    use of possibly-uninitialized `locked`
So the real question is how to get a &T out of a MutexGuard while using it inside spawn_blocking, and also together with #[async_recursion].
Edit:
    let (mut locked, mut loaded) = (None, None);
    if lsm {
        locked = Some(CONVMODEL.lock().await);
    } else {
        loaded = Some(conv_model_loader());
    };
    let mut chat_context = CHAT_CONTEXT.lock().await;
    task::spawn_blocking(move || {
        let (lock, load);
        let conversation_model = if lsm {
            lock = locked.unwrap();
            &*lock
        } else {
            load = loaded.unwrap();
            &load
        };
The code above works, but it is very ugly.
(I wonder whether it can be simplified.)

Whenever you have a set of choices for a value, you want to reach for an enum. For example, in Rust we don't write let value: T; let is_initialized: bool;, we use Option<T>.
You have a choice of two values, either an acquired mutex guard or a direct value. This is typically called "either", and there is a popular Rust crate containing this type: Either. For you it might look like:
    use either::Either;
    use std::ops::Deref; // brings `deref()` into scope
    let conv_model = if lsm {
        Either::Left(CONVMODEL.lock().await)
    } else {
        Either::Right(conv_model_loader())
    };
    tokio::task::spawn_blocking(move || {
        let conversation_model = match &conv_model {
            Either::Left(locked) => locked.deref(),
            Either::Right(loaded) => loaded,
        };
        conversation_model.infer();
    });
This type used to live in the standard library, but it was removed: it was rarely used, because it is fairly trivial to write a more descriptive domain-specific type instead. I agree with that reasoning, and you might do:
    use std::ops::Deref;
    pub enum ConvModelSource {
        Locked(MutexGuard<'static, ConvModel>),
        Loaded(ConvModel),
    }
    impl Deref for ConvModelSource {
        type Target = ConvModel;
        fn deref(&self) -> &Self::Target {
            match self {
                Self::Locked(guard) => guard.deref(),
                Self::Loaded(model) => model,
            }
        }
    }
    // ...
    let conv_model = if lsm {
        ConvModelSource::Locked(CONVMODEL.lock().await)
    } else {
        ConvModelSource::Loaded(conv_model_loader())
    };
    tokio::task::spawn_blocking(move || {
        conv_model.infer();
    });
This is much more expressive, and moves the "how to populate this" away from where it's used.
In the common case you do want to use the simpler approach user4815162342 showed. You will store one of the temporaries, form a reference to it (knowing you just initialized it), and hand that back.
This doesn't work with spawn_blocking, however. The lifetime of the reference is that of the temporaries, so handing such a reference off to a spawned task would leave it dangling.
This is why the error messages (of the form "borrowed value does not live long enough" and "argument requires that locked is borrowed for 'static") guided you to go down the path of trying to move locked and loaded into the closure to be in their final resting place, then form a reference. Then the reference wouldn't be dangling.
But then this implies you move a possibly-uninitialized value into the closure. Rust does not understand you are using an identical check to see which temporary value is populated. (You could imagine a typo on the second check doing !lsm and now you're switched up.)
Ultimately, you have to move the source of the value into the spawned task (closure) so that you form references with usable lifetimes. The use of enum is basically codifying your boolean case check into something Rust understands and will unpack naturally.
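As a self-contained illustration of that last point, here is a minimal sketch using only the standard library. The ConvModel type, its infer method and the loader body are made up for the example, and because std's MutexGuard cannot be sent to another thread, the closure is simply called in place instead of being handed to spawn_blocking. What it shows is the ownership pattern: the enum is moved into the closure, and the reference is only formed there.

```rust
use std::ops::Deref;
use std::sync::{Mutex, MutexGuard};

// Hypothetical stand-in for the real model type.
struct ConvModel {
    name: &'static str,
}

impl ConvModel {
    fn infer(&self) -> String {
        format!("inferred with {}", self.name)
    }
}

// Hypothetical shared model; `Mutex::new` is usable in a static.
static CONVMODEL: Mutex<ConvModel> = Mutex::new(ConvModel { name: "shared" });

// Hypothetical loader that builds a fresh model.
fn conv_model_loader() -> ConvModel {
    ConvModel { name: "fresh" }
}

// Either a held lock or an owned value; both can hand out `&ConvModel`.
enum ConvModelSource {
    Locked(MutexGuard<'static, ConvModel>),
    Loaded(ConvModel),
}

impl Deref for ConvModelSource {
    type Target = ConvModel;

    fn deref(&self) -> &ConvModel {
        match self {
            Self::Locked(guard) => &**guard,
            Self::Loaded(model) => model,
        }
    }
}

fn run(lsm: bool) -> String {
    let source = if lsm {
        ConvModelSource::Locked(CONVMODEL.lock().unwrap())
    } else {
        ConvModelSource::Loaded(conv_model_loader())
    };
    // The closure owns `source`, so the reference created inside it
    // cannot outlive the value it points to.
    let job = move || source.infer();
    job()
}

fn main() {
    println!("{}", run(true));
    println!("{}", run(false));
}
```

There is no second boolean check inside the closure: the match in deref is the only place that decides which variant is present.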

You can extract a &mut T from both and use that. Something like the following should work:
    // pun not intended; both must be declared `mut` to be borrowed mutably
    let (mut locked, mut loaded);
    let conversation_model = if lsm {
        locked = CONVMODEL.lock().await;
        &mut *locked
    } else {
        loaded = conv_model_loader();
        &mut loaded
    };

Related

Moving something that is in scope of a reference when those references will go out of scope immediately after the move

I am learning Rust coming from a C++ background. I am writing a parser that reads valid formulas such as 12 x (y - 4) and puts them in an abstract syntax tree (AST). I use an enum to store the possible nodes in the AST, as you can see in the (heavily simplified) code. I ran into a problem: I want to simplify an expression such as -(-(12)) by moving the 12, not copying it. In general, the 12 may be replaced by a deep AST. Currently I identify the situation in the function simplify( .. ), but it won't compile.
I do know why it doesn't compile: the node I'm trying to move (i.e. the 12 in my example) is still borrowed through a reference created in the match clause, so this is exactly Rust protecting me from a possible problem with references. But in my case I know this is what I want, and moreover I exit the function right after the move in the line *node = **child;, so the earlier references go out of scope right there.
Is there an idiomatic Rust way to solve this problem? I would really rather not copy the double-negated subtree.
    #[derive(Debug)]
    enum Node {
        Num(i32),
        UnaryMinus(Box<Node>),
    }
    use Node::*;
    fn simplify(node: &mut Node) {
        match &node {
            Num(_) => (),
            UnaryMinus(inner) => match &**inner {
                Num(_) => (),
                UnaryMinus(child) => {
                    // cannot move out of `**child` which is behind a shared reference
                    *node = **child;
                    return;
                }
            },
        }
    }
    fn main() {
        let double_minus = UnaryMinus(Box::new(UnaryMinus(Box::new(Num(12)))));
    }
There are two problems here. The concrete error you're getting is because you're matching against &**inner instead of &mut **inner. It's never okay to move out of a shared reference, but there are circumstances (which we'll see) when it's okay to move out of a mutable one.
With that fixed, you will get the same error, only now about mutable references. That's because you can't move out of a mutable reference just because you know you're the only one holding it. A move leaves its source uninitialized, and a reference, mutable or shared, to uninitialized memory is never okay. You have to leave something in the memory you're moving out of. You can do that with std::mem::swap, which takes two mutable references and swaps their contents.
Now, one could obviously try to call std::mem::swap(node, &mut child) but this won't work simply because node is already mutably borrowed in the match expression and you can't mutably borrow something twice.
Moreover, this would leak memory as you now have a reference cycle where node -> inner -> node. This, although perfectly valid to do in Rust, usually isn't what you want.
Instead you'll need some sort of dummy that you can put in child's place. Some simple variant of your enum that can be safely dropped by inner once it gets dropped. In this example that could be Node::Num(0):
    #[derive(Debug)]
    enum Node {
        Num(i32),
        UnaryMinus(Box<Node>),
    }
    // This is just to verify that everything gets dropped properly
    // and we don't leak any memory
    impl Drop for Node {
        fn drop(&mut self) {
            println!("dropping {:?}", self);
        }
    }
    use Node::*;
    fn simplify(node: &mut Node) {
        // `placeholder` will be our dummy
        let (mut can_simplify, mut placeholder) = (false, Num(0));
        match node {
            Num(_) => (),
            // we'll need to borrow `child` mutably later, so we have to
            // match on `&mut **inner` not `&**inner`
            UnaryMinus(inner) => match &mut **inner {
                Num(_) => (),
                UnaryMinus(ref mut child) => {
                    // move the contents of `child` into `placeholder` and vice versa
                    std::mem::swap(&mut placeholder, child);
                    can_simplify = true;
                }
            },
        }
        if can_simplify {
            // now we can safely move into `node`
            *node = placeholder;
            // you could skip the conditional if all other, non-simplifying
            // branches return before this statement
        }
    }
    fn main() {
        let mut double_minus = UnaryMinus(Box::new(UnaryMinus(Box::new(Num(12)))));
        simplify(&mut double_minus);
        println!("{:?}", double_minus);
    }
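For comparison, here is a more compact sketch of the same idea that uses std::mem::replace instead of std::mem::swap. This is a swapped-in variant, not the code from the answer above: it takes the grandchild out of its Box in one step, leaving Num(0) behind as the dummy, and then overwrites node. The PartialEq derive is only there so the result can be compared.

```rust
#[derive(Debug, PartialEq)]
enum Node {
    Num(i32),
    UnaryMinus(Box<Node>),
}

use Node::*;

fn simplify(node: &mut Node) {
    if let UnaryMinus(inner) = node {
        if let UnaryMinus(child) = &mut **inner {
            // Move the grandchild out and leave a cheap dummy in its place,
            // so the memory behind `child` stays initialized.
            let grandchild = std::mem::replace(&mut **child, Num(0));
            // The borrows through `inner` and `child` are no longer used,
            // so `node` can be overwritten; the old subtree is dropped here.
            *node = grandchild;
        }
    }
}

fn main() {
    let mut double_minus = UnaryMinus(Box::new(UnaryMinus(Box::new(Num(12)))));
    simplify(&mut double_minus);
    println!("{:?}", double_minus);
}
```

The moved subtree is never cloned; only the two emptied UnaryMinus wrappers and the dummy are dropped.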

Is there idiomatic way to handle "reference or create and use reference to it" pattern in Rust?

I realized that very often in Rust I need the following pattern:
    let variable = &some_ref;
    let variable = if something {
        let new_variable = create_something();
        &new_variable
    } else {
        variable
    };
    // Use variable here
In other words, I need to either use an existing reference or create a new owned value and use a reference to it.
The problem is that if I write it as in the example above, new_variable does not live long enough: it is dropped at the end of the first if branch.
Is there an idiomatic way to structure the code to achieve "use a reference, or create a new value and use a reference to that"? Or do I just have to factor the code that uses variable into a function and call it twice, once in the branch where I already have a reference and once in the branch where I create an owned value and then take a reference to it?
Here is a real-world example of how I usually use a function (overlay_with_u32 in this case) to share the behavior between the two branches:
    let source = &source;
    if is_position_negative(x, y) {
        let source = crop(source, x, y);
        overlay_with_u32(destination, &source, x, y);
    } else {
        overlay_with_u32(destination, source, x, y);
    };
You can use Cow (clone-on-write) for this.
The Cow type is an enum of either an owned or a borrowed value:
    pub enum Cow<'a, B>
    where
        B: 'a + ToOwned + ?Sized,
    {
        Borrowed(&'a B),
        Owned(<B as ToOwned>::Owned),
    }
You could use it something like this (variables renamed for clarity):
    use std::borrow::Cow;
    let variable_ref = &some_ref;
    let variable = if something {
        let variable_created = create_something();
        Cow::Owned(variable_created)
    } else {
        Cow::Borrowed(variable_ref)
    };
Functions that accept a &T can be given a &Cow<T>, which will automatically be dereferenced as you'd expect:
    let variable: Cow<'_, i32> = Cow::Owned(3);
    do_stuff(&variable);
    fn do_stuff(i: &i32) {
        println!("{}", i);
    }
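To make this concrete, here is a small runnable sketch loosely modeled on the crop/overlay example from the question. The crop and overlay_sum functions are hypothetical stand-ins working on plain slices of u32, not the real image functions; the point is only that both branches end up as one Cow that is passed on as a plain slice reference.

```rust
use std::borrow::Cow;

// Hypothetical stand-in for `crop`: returns a new buffer without the
// first `skip` elements.
fn crop(source: &[u32], skip: usize) -> Vec<u32> {
    source.iter().skip(skip).copied().collect()
}

// Hypothetical stand-in for `overlay_with_u32`: only needs a borrowed slice.
fn overlay_sum(source: &[u32]) -> u32 {
    source.iter().sum()
}

fn overlay(source: &[u32], skip: usize) -> u32 {
    // Either borrow the existing data or own a freshly cropped copy.
    let source: Cow<'_, [u32]> = if skip > 0 {
        Cow::Owned(crop(source, skip))
    } else {
        Cow::Borrowed(source)
    };
    // `&Cow<[u32]>` dereferences to `&[u32]`, so there is a single call site.
    overlay_sum(&source)
}

fn main() {
    let data = [1, 2, 3];
    println!("{}", overlay(&data, 0));
    println!("{}", overlay(&data, 1));
}
```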
Using Cow may be the right thing to do, but here I'm going to suggest another approach that might be cheaper (especially if the types have no Drop implementation, in which case it is zero-cost apart from possibly needing more stack space), but requires more code and may be less obvious.
Rust allows you to declare a variable and initialize it conditionally, as long as the compiler can prove that the variable is always initialized wherever it is used. You can exploit this to extend the lifetime of a variable beyond an inner scope. Instead of:
    let reference = {
        let variable = ...;
        &variable
    };
You can write:
    let variable;
    let reference = {
        variable = ...;
        &variable
    };
Now variable lives long enough.
Applied to your case, it looks like:
    let variable = &some_ref;
    let new_variable;
    let variable = if something {
        new_variable = create_something();
        &new_variable
    } else {
        variable
    };
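Here is a runnable sketch of that deferred-initialization pattern. The create_something body and the describe function are invented for the example; the part that matters is that new_variable is declared before the if and only assigned in one branch.

```rust
// Hypothetical stand-in for the question's `create_something()`.
fn create_something() -> String {
    String::from("freshly created")
}

fn describe(existing: &str, something: bool) -> String {
    // Declared in the outer scope, but only initialized in one branch.
    let new_variable;
    let variable: &str = if something {
        new_variable = create_something();
        new_variable.as_str()
    } else {
        existing
    };
    // The shared code path only ever sees a `&str`.
    format!("using: {}", variable)
}

fn main() {
    println!("{}", describe("existing value", false));
    println!("{}", describe("existing value", true));
}
```

This works because the reference is used in the same scope in which new_variable is declared; it would not help if the reference had to be moved into a spawned task.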

Return a reference to a T inside a lazy static RwLock<Option<T>>?

I have a lazy static struct that I want to be able to set to some random value in the beginning of the execution of the program, and then get later. This little silly snippet can be used as an example:
    use lazy_static::lazy_static;
    use std::sync::RwLock;
    struct Answer(i8);
    lazy_static! {
        static ref ANSWER: RwLock<Option<Answer>> = RwLock::new(None);
    }
    fn answer_question() {
        *ANSWER.write().unwrap() = Some(Answer(42));
    }
    fn what_is_the_answer() -> &'static Answer {
        ANSWER
            .read()
            .unwrap()
            .as_ref()
            .unwrap()
    }
This code fails to compile:
    error[E0515]: cannot return value referencing temporary value
      --> src/lib.rs:15:5
       |
    15 |        ANSWER
       |   _____^
       |  |_____|
       | ||
    16 | ||         .read()
    17 | ||         .unwrap()
       | ||_________________- temporary value created here
    18 | |          .as_ref()
    19 | |          .unwrap()
       | |__________________^ returns a value referencing data owned by the current function
I know you can not return a reference to a temporary value. But I want to return a reference to ANSWER which is static - the very opposite of temporary! I guess it is the RwLockReadGuard that the first call to unwrap returns that is the problem?
I can get the code to compile by changing the return type:
    fn what_is_the_answer() -> RwLockReadGuard<'static, Option<Answer>> {
        ANSWER
            .read()
            .unwrap()
    }
But now the calling code becomes very unergonomic - I have to do two extra calls to get to the actual value:
what_is_the_answer().as_ref().unwrap()
Can I somehow return a reference to the static ANSWER from this function? Can I get it to return a RwLockReadGuard<&Answer> maybe by mapping somehow?
once_cell is designed for this: use .set(...).unwrap() in answer_question and .get().unwrap() in what_is_the_answer.
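A minimal sketch of that suggestion, assuming a toolchain recent enough to have std::sync::OnceLock (Rust 1.70 or later), which offers the same set and get calls as once_cell's sync::OnceCell. The Debug derive is an addition needed so that the result of set can be unwrapped; PartialEq is only there for comparing the result.

```rust
use std::sync::OnceLock;

#[derive(Debug, PartialEq)]
struct Answer(i8);

// Starts out empty and can be filled exactly once at runtime.
static ANSWER: OnceLock<Answer> = OnceLock::new();

fn answer_question() {
    // `set` returns `Err(value)` if the cell was already initialized.
    ANSWER.set(Answer(42)).expect("answer was already set");
}

fn what_is_the_answer() -> &'static Answer {
    // `get` returns `None` until `set` has been called.
    ANSWER.get().expect("answer has not been set yet")
}

fn main() {
    answer_question();
    println!("{:?}", what_is_the_answer());
}
```

Because the value never moves or changes after set, get can hand out a plain &'static Answer with no guard involved.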
As far as I understand your intention, the value of Answer can't be computed while it is being initialized in the lazy_static but depends on parameters known only when answer_question is called. The following may not be the most elegant solution, yet it allows for having a &'static-reference to a value that depends on parameters only known at runtime.
The basic approach is to use two lazy_static-values, one of which serves as a "proxy" to do the necessary synchronization, the other being the value itself. This avoids having to access multiple layers of locks and unwrapping of Option-values whenever you access ANSWER.
The ANSWER value is initialized by waiting on a Condvar, which signals once the value has been computed. The value is then placed in the lazy_static and never moves again, so a &'static reference is possible (see get_the_answer()). I have chosen String as the example type. Notice that accessing ANSWER without calling generate_the_answer() will make the initialization wait forever, deadlocking the program.
    use std::{sync, thread};
    lazy_static::lazy_static! {
        // A proxy to synchronize when the value is generated
        static ref ANSWER_PROXY: (sync::Mutex<Option<String>>, sync::Condvar) = {
            (sync::Mutex::new(None), sync::Condvar::new())
        };
        // The actual value, which is initialized from the proxy and stays in place
        // forever, hence allowing &'static access
        static ref ANSWER: String = {
            let (lock, cvar) = &*ANSWER_PROXY;
            let mut answer = lock.lock().unwrap();
            loop {
                // As long as the proxy is None, the answer has not been generated
                match answer.take() {
                    None => answer = cvar.wait(answer).unwrap(),
                    Some(answer) => return answer,
                }
            }
        };
    }
    // Generate the answer and place it in the proxy. The `param` is just here
    // to demonstrate we can move owned values into the proxy
    fn generate_the_answer(param: String) {
        // We don't need a thread here, yet we can
        thread::spawn(move || {
            println!("Generating the answer...");
            let mut s = String::from("Hello, ");
            s.push_str(&param);
            thread::sleep(std::time::Duration::from_secs(1));
            let (lock, cvar) = &*ANSWER_PROXY;
            *lock.lock().unwrap() = Some(s);
            cvar.notify_one();
            println!("Answer generated.");
        });
    }
    // Nothing to see here, except that we have a &'static reference to the answer
    fn get_the_answer() -> &'static str {
        println!("Asking for the answer...");
        &ANSWER
    }
    fn main() {
        println!("Hello, world!");
        // Accessing `ANSWER` without generating it will deadlock!
        //get_the_answer();
        generate_the_answer(String::from("John!"));
        println!("The answer is \"{}\"", get_the_answer());
        // The second time a value is generated, no one is listening.
        // This is the flip side of `ANSWER` being a &'static
        generate_the_answer(String::from("Peter!"));
        println!("The answer is still \"{}\"", get_the_answer());
    }

Re-using values without declaring variables

In Kotlin, I can reuse a value like so:
    "127.0.0.1:135".let {
        connect(it) ?: System.err.println("Failed to connect to $it")
    }
Is anything similar possible in Rust? To avoid using a temporary variable like this:
    let text_address = "127.0.0.1:135";
    TcpListener::bind(text_address).expect(&format!("Failed to connect to {}", text_address));
According to the Kotlin reference, T.let is a generic method-like function which runs a closure (T) -> R with the given value T passed as the first argument. From this perspective, it resembles a mapping operation from T to R. Under Kotlin's syntax, though, it looks like a means of making a scoped variable with additional emphasis.
We could do the exact same thing in Rust, but it doesn't bring anything new to the table, nor does it make the code cleaner (using _let because let is a keyword in Rust):
    trait LetMap {
        fn _let<F, R>(self, mut f: F) -> R
        where
            Self: Sized,
            F: FnMut(Self) -> R,
        {
            f(self)
        }
    }
    impl<T> LetMap for T {}
    // then...
    "something"._let(|it| {
        println!("it = {}", it);
        "good"
    });
When dealing with a single value, it is actually more idiomatic to just declare a variable. If you need to constrain the variable (and/or the value's lifetime) to a particular scope, just place it in a block:
    let conn = {
        let text_address = "127.0.0.1:135";
        TcpListener::bind(text_address)?
    };
There is also one more situation worth mentioning: Kotlin has an idiom for nullable values where x?.let is used to conditionally perform something when the value isn't null.
    val value = ...
    value?.let {
        ... // execute this block if not null
    }
In Rust, an Option already provides a similar feature, either through pattern matching or the many available methods with conditional execution: map, map_or_else, unwrap_or_else, and_then, and more.
    let value: Option<_> = get_opt();
    // 1: pattern matching
    if let Some(non_null_value) = value {
        // ...
    }
    // 2: functional methods
    let new_opt_value: Option<_> = value
        .map(|non_null_value| "a new value")
        .and_then(some_function_returning_opt);
This is similar:
    {
        let text_address = "127.0.0.1:135";
        TcpListener::bind(text_address).expect(&format!("Failed to connect to {}", text_address));
    }
    // now text_address is out of scope

How to consume and replace a value in an &mut ref [duplicate]

Sometimes I run into a problem where, due to implementation details that should be invisible to the user, I need to "destroy" a &mut and replace it in-memory. This typically ends up happening in recursive methods or IntoIterator implementations on recursive structures. It typically follows the form of:
    fn create_something(self);
    pub fn do_something(&mut self) {
        // What you want to do
        *self = self.create_something();
    }
One example that I happened to have in my current project is in a KD Tree I've written, when I "remove" a node, instead of doing logic to rearrange the children, I just destructure the node I need to remove and rebuild it from the values in its subtrees:
    // Some recursive checks to identify if this is our node above this
    if let Node{point, left, right} = mem::replace(self, Sentinel) {
        let points = left.into_iter().chain(right.into_iter()).collect();
        (*self) = KDNode::new(points);
        Some(point)
    } else {
        None
    }
Another more in-depth example is the IntoIterator for this KDTree, which has to move a curr value out of the iterator, test it, and then replace it:
    // temporarily swap self.curr with a dummy value so we can
    // move out of it
    let tmp = mem::replace(&mut self.curr, (Sentinel,Left));
    match tmp {
        // If the next node is a Sentinel, that means the
        // "real" next node was either the parent, or we're done
        (Sentinel,_) => {
            if self.stack.is_empty() {
                None
            } else {
                self.curr = self.stack.pop().expect("Could not pop iterator parent stack");
                self.next()
            }
        }
        // If the next node is to yield the current node,
        // then the next node is its right child's leftmost
        // descendent. We only "load" the right child, and lazily
        // evaluate to its left child next iteration.
        (Node{box right,point,..},Me) => {
            self.curr = (right,Left);
            Some(point)
        },
        // Left is an instruction to lazily find this node's left-most
        // non-sentinel child, so we recurse down, pushing the parents on the
        // stack as we go, and then say that our next node is our right child.
        // If this child doesn't exist, then it will be taken care of by the Sentinel
        // case next call.
        (curr @ Node{..},Left) => {
            let mut curr = curr;
            let mut left = get_left(&mut curr);
            while !left.is_sentinel() {
                self.stack.push((curr,Me));
                curr = left;
                left = get_left(&mut curr);
            }
            let (right,point) = get_right_point(curr);
            self.curr = (right, Left);
            Some(point)
        }
As you can see, my current method is to just use mem::replace with a dummy value, and then just overwrite the dummy value later. However, I don't like this for several reasons:
In some cases, there's no suitable dummy value. This is especially true if there's no public/easy way to construct a "zero value" for one or more of your struct members (e.g. what if the struct held a MutexGuard?). If the member you need to dummy-replace is in another module (or crate), you may be bound by difficult constraints on its construction that are undesirable when trying to build a dummy value.
The struct may be rather large, in which case doing more moves than is necessary may be undesirable (in practice, this is unlikely to be a big problem, admittedly).
It just "feels" unclean, since the "move" is technically more of an "update". In fact, the simplest example might be something like *self = self.next.do_something() which will still have problems.
In some cases, such as that first remove snippet I showed, you could perhaps more cleanly represent it as a fn do_something(self) -> Self, but in other cases such as the IntoIterator example this can't be done because you're constrained by the trait definition.
Is there any better, cleaner way to do this sort of in-place update?
In any case we'll need assignment, mem::replace, mem::swap, or something like that. Given a &mut reference to an object, there is no way to move this object (or any of its fields) out without replacing its memory area with something valid, because Rust forbids references to uninitialized memory.
As for dummy values for replacement, you can always make them yourself for any type by using some wrapper type. For example, I often use Option for this purpose, where Some(T) is the value of type T, and None acts as dummy. This is what I mean:
    struct Tree<T>(Option<Node<T>>);
    enum Node<T> {
        Leaf(T),
        Children(Vec<Tree<T>>),
    }
    impl<T> Tree<T> where T: PartialEq {
        fn remove(&mut self, value: &T) {
            match self.0.take() {
                Some(Node::Leaf(ref leaf_value)) if leaf_value == value =>
                    (),
                node @ Some(Node::Leaf(..)) =>
                    *self = Tree(node),
                Some(Node::Children(node_children)) => {
                    let children: Vec<_> =
                        node_children
                            .into_iter()
                            .filter_map(|mut tree| { tree.remove(value); tree.0 })
                            .map(|node| Tree(Some(node)))
                            .collect();
                    if !children.is_empty() {
                        *self = Tree(Some(Node::Children(children)));
                    }
                },
                None =>
                    panic!("something went wrong"),
            }
        }
    }
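The same Option-as-dummy idea can be sketched on a smaller case that matches the *self = self.create_something() shape from the question. The State and Machine types and the advance function are invented for illustration: the field is wrapped in an Option, take() moves the old value out and leaves None behind, and the rebuilt value is stored straight back.

```rust
#[derive(Debug, PartialEq)]
enum State {
    Idle,
    Running { log: Vec<String> },
}

// Consumes the old state by value and builds the next one from it.
fn advance(state: State) -> State {
    match state {
        State::Idle => State::Running {
            log: vec![String::from("started")],
        },
        State::Running { mut log } => {
            log.push(String::from("tick"));
            State::Running { log }
        }
    }
}

struct Machine {
    // `None` only exists briefly inside `step`; it plays the dummy role.
    state: Option<State>,
}

impl Machine {
    fn step(&mut self) {
        // `take` moves the state out and leaves `None` in its place.
        let old = self
            .state
            .take()
            .expect("state is always present between calls");
        self.state = Some(advance(old));
    }
}

fn main() {
    let mut machine = Machine {
        state: Some(State::Idle),
    };
    machine.step();
    machine.step();
    println!("{:?}", machine.state);
}
```

The cost is that every access to the field has to go through the Option, so this fits best when the field is private to the type.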