Rust matching and borrow checker

I keep stumbling on a pattern in my Rust programs that always puts me at odds with the borrow-checker. Consider the following toy example:
use std::sync::{Arc, RwLock};
pub struct Test {
    thing: i32,
}
pub struct Test2 {
    pub test: Arc<RwLock<Test>>,
    pub those: i32,
}
impl Test {
    pub fn foo(&self) -> Option<i32> {
        Some(3)
    }
}
impl Test2 {
    pub fn bar(&mut self) {
        let mut test_writer = self.test.write().unwrap();
        match test_writer.foo() {
            Some(thing) => {
                self.add(thing);
            },
            None => {}
        }
    }
    pub fn add(&mut self, addme: i32) {
        self.those += addme;
    }
}
This doesn't compile because the add function in the Some arm tries to borrow self mutably, which was already borrowed immutably just above the match statement in order to open the read-write lock.
I've encountered this pattern a few times in Rust, mainly when using RwLock. I've also found a workaround: introduce a boolean before the match statement, set it in the Some arm, and then test it after the match statement to do whatever I wanted to do in the Some arm.
That doesn't seem like the right way to go about it. I assume there's a more idiomatic way to do this in Rust, or a way to solve the problem entirely differently, but I can't find it. If I'm not mistaken, the problem has to do with lexical borrowing, so self cannot be mutably borrowed within the arms of the match statement.
Is there an idiomatic Rust way to solve this sort of problem?

Use the field those directly, for example through a custom type:
use std::sync::{Arc, RwLock};
pub struct Those(i32);
impl Those {
    fn get(&self) -> i32 {
        self.0
    }
    fn add(&mut self, n: i32) {
        self.0 += n;
    }
}
pub struct Test {
    thing: Those,
}
pub struct Test2 {
    pub test: Arc<RwLock<Test>>,
    pub those: Those,
}
impl Test {
    pub fn foo(&self) -> Option<Those> {
        Some(Those(3))
    }
}
impl Test2 {
    pub fn bar(&mut self) {
        let mut test_writer = self.test.write().unwrap();
        match test_writer.foo() {
            Some(thing) => {
                // Call add directly on the field's type: this borrows only
                // self.those, not all of self, so the borrow checker accepts it.
                self.those.add(thing.get());
            },
            None => {}
        }
    }
}

You either need to end the borrow of that part of self before mutating self:
pub fn bar1(&mut self) {
    let foo = self.test.write().unwrap().foo();
    match foo {
        Some(thing) => {
            self.add(thing);
        },
        None => {}
    }
}
or directly mutate a part of self that is not borrowed:
pub fn bar2(&mut self) {
    let test_writer = self.test.write().unwrap();
    match test_writer.foo() {
        Some(thing) => {
            self.those += thing;
        },
        None => {}
    }
}
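The first option can also be written with an explicit drop, which makes the point where the borrow ends visible. This is only a minimal sketch under a few assumptions: a read lock is used because foo only takes &self, the local name found is made up for illustration, and foo returns the stored thing so the effect can be observed.

```rust
use std::sync::{Arc, RwLock};

pub struct Test {
    thing: i32,
}

impl Test {
    pub fn foo(&self) -> Option<i32> {
        Some(self.thing)
    }
}

pub struct Test2 {
    pub test: Arc<RwLock<Test>>,
    pub those: i32,
}

impl Test2 {
    pub fn bar(&mut self) {
        // Take the lock, copy the plain value out, then release the guard.
        let guard = self.test.read().unwrap();
        let found = guard.foo(); // `found` is a hypothetical local name
        drop(guard); // the borrow of `self.test` ends here

        // No part of `self` is borrowed any more, so `&mut self` is allowed.
        if let Some(thing) = found {
            self.add(thing);
        }
    }

    pub fn add(&mut self, addme: i32) {
        self.those += addme;
    }
}

fn main() {
    let mut t = Test2 {
        test: Arc::new(RwLock::new(Test { thing: 3 })),
        those: 1,
    };
    t.bar();
    println!("those = {}", t.those);
}
```

This works because Option<i32> is an owned value with no reference into the locked data. If foo returned a reference, the guard would have to stay alive for as long as that reference is used.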

Related

Allowing extension (beyond the crate) of implementation with event loop

Within the crate we can happily do something like this:
mod boundary {
pub struct EventLoop;
impl EventLoop {
pub fn run(&self) {
for _ in 0..2 {
self.handle("bundled");
self.foo();
}
}
pub fn handle(&self, message: &str) {
println!("{} handling", message)
}
}
pub trait EventLoopExtend {
fn foo(&self);
}
}
use boundary::EventLoopExtend;
impl EventLoopExtend for boundary::EventLoop {
fn foo(&self) {
self.handle("extended")
}
}
fn main() {
let el = boundary::EventLoop{};
el.run();
}
But if mod boundary were a crate boundary we get error[E0117]: only traits defined in the current crate can be implemented for arbitrary types.
I gather that a potential solution to this could be the New Type idiom, so something like this:
mod boundary {
pub struct EventLoop;
impl EventLoop {
pub fn run(&self) {
for _ in 0..2 {
self.handle("bundled");
self.foo();
}
}
pub fn handle(&self, message: &str) {
println!("{} handling", message)
}
}
pub trait EventLoopExtend {
fn foo(&self);
}
impl EventLoopExtend for EventLoop {
fn foo(&self) {
self.handle("unimplemented")
}
}
}
use boundary::{EventLoop, EventLoopExtend};
struct EventLoopNewType(EventLoop);
impl EventLoopExtend for EventLoopNewType {
fn foo(&self) {
self.0.handle("extended")
}
}
fn main() {
let el = EventLoopNewType(EventLoop {});
el.0.run();
}
But then the problem here is that the extended trait behaviour isn't accessible from the underlying EventLoop instance.
I'm still quite new to Rust, so I'm sure I'm missing something obvious. I wouldn't be surprised if I need to take a completely different approach.
Specifically in my case, the event loop is actually from wgpu, and I'm curious if it's possible to build a library where end users can provide their own "render pass" stage.

Thanks to @AlexN's comment I dug deeper into the Strategy Pattern and found a solution:
mod boundary {
pub struct EventLoop<'a, T: EventLoopExtend> {
extension: &'a T
}
impl<'a, T: EventLoopExtend> EventLoop<'a, T> {
pub fn new(extension: &'a T) -> Self {
Self { extension }
}
pub fn run(&self) {
for _ in 0..2 {
self.handle("bundled");
self.extension.foo(self);
}
}
pub fn handle(&self, message: &str) {
println!("{} handling", message)
}
}
pub trait EventLoopExtend {
fn foo<T: EventLoopExtend>(&self, el: &EventLoop<T>) {
el.handle("unimplemented")
}
}
}
use boundary::{EventLoop, EventLoopExtend};
struct EventLoopExtension;
impl EventLoopExtend for EventLoopExtension {
fn foo<T: EventLoopExtend>(&self, el: &EventLoop<T>) {
el.handle("extended")
}
}
fn main() {
let el = EventLoop::new(&EventLoopExtension {});
el.run();
}
The basic idea is to use generics with a trait bound. I think the first time I looked into this approach I was worried about type recursion. But it turns out passing the EventLoop object as an argument to EventLoopExtend trait methods is perfectly reasonable.
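A possible variation, in case the generic method on the trait gets in the way: hand the extension a small context object instead of the whole event loop. This is only a sketch under my own assumptions. Context and its log are hypothetical names that do not come from wgpu or from the code above, and handle records messages instead of printing them so the result can be inspected.

```rust
mod boundary {
    /// Hypothetical state that extensions are allowed to touch.
    pub struct Context {
        log: Vec<String>,
    }

    impl Context {
        pub fn handle(&mut self, message: &str) {
            self.log.push(format!("{} handling", message));
        }

        pub fn log(&self) -> &[String] {
            &self.log
        }
    }

    pub trait EventLoopExtend {
        // Default behaviour, used when an implementor does not override it.
        fn foo(&self, ctx: &mut Context) {
            ctx.handle("unimplemented");
        }
    }

    pub struct EventLoop<E: EventLoopExtend> {
        extension: E,
        ctx: Context,
    }

    impl<E: EventLoopExtend> EventLoop<E> {
        pub fn new(extension: E) -> Self {
            Self { extension, ctx: Context { log: Vec::new() } }
        }

        pub fn run(&mut self) {
            for _ in 0..2 {
                self.ctx.handle("bundled");
                // `extension` and `ctx` are separate fields, so borrowing one
                // immutably and the other mutably at the same time is allowed.
                self.extension.foo(&mut self.ctx);
            }
        }

        pub fn log(&self) -> &[String] {
            self.ctx.log()
        }
    }
}

use boundary::{Context, EventLoop, EventLoopExtend};

struct EventLoopExtension;

impl EventLoopExtend for EventLoopExtension {
    fn foo(&self, ctx: &mut Context) {
        ctx.handle("extended");
    }
}

fn main() {
    let mut el = EventLoop::new(EventLoopExtension);
    el.run();
    for line in el.log() {
        println!("{}", line);
    }
}
```

The trade-off is that the extension only sees what Context exposes, which may or may not be enough for a render pass stage.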

How do I implement an iterator from a vector of std::Rc<std::RefCell<T>> smart pointers?

I'm trying to understand how to work with interior mutability. This question is strongly related to my previous question.
I have a generic struct Port<T> that owns a Vec<T>. We can "chain" port B to port A so, when reading the content of port A, we are able to read the content of port B. However, this chaining is hidden to port A's reader. That is why I implemented the iter(&self) method:
use std::rc::Rc;
pub struct Port<T> {
values: Vec<T>,
ports: Vec<Rc<Port<T>>>,
}
impl <T> Port<T> {
pub fn new() -> Self {
Self { values: vec![], ports: vec![] }
}
pub fn add_value(&mut self, value: T) {
self.values.push(value);
}
pub fn is_empty(&self) -> bool {
self.values.is_empty() && self.ports.is_empty()
}
pub fn chain_port(&mut self, port: Rc<Port<T>>) {
if !port.is_empty() {
self.ports.push(port)
}
}
pub fn iter(&self) -> impl Iterator<Item = &T> {
self.values.iter().chain(
self.ports.iter()
.flat_map(|p| Box::new(p.iter()) as Box<dyn Iterator<Item = &T>>)
)
}
pub fn clear(&mut self) {
self.values.clear();
self.ports.clear();
}
}
The application has the following pseudo-code behavior:
create ports
loop:
fill ports with values
chain ports
iterate over ports' values
clear ports
The main function should look like this:
fn main() {
let mut port_a = Rc::new(Port::new());
let mut port_b = Rc::new(Port::new());
loop {
port_a.add_value(1);
port_b.add_value(2);
port_a.chain_port(port_b.clone());
for val in port_a.iter() {
// read data
};
port_a.clear();
port_b.clear();
}
}
However, the compiler complains:
error[E0596]: cannot borrow data in an `Rc` as mutable
--> src/modeling/port.rs:46:9
|
46 | port_a.add_value(1);
| ^^^^^^ cannot borrow as mutable
|
= help: trait `DerefMut` is required to modify through a dereference, but it is not implemented for `Rc<Port<i32>>`
I've been reading several posts etc., and it seems that I need to work with Rc<RefCell<Port<T>>> to be able to mutate the ports. I changed the implementation of Port<T>:
use std::cell::RefCell;
use std::rc::Rc;
pub struct Port<T> {
values: Vec<T>,
ports: Vec<Rc<RefCell<Port<T>>>>,
}
impl<T> Port<T> {
// snip
pub fn chain_port(&mut self, port: Rc<RefCell<Port<T>>>) {
if !port.borrow().is_empty() {
self.ports.push(port)
}
}
pub fn iter(&self) -> impl Iterator<Item = &T> {
self.values.iter().chain(
self.ports
.iter()
.flat_map(|p| Box::new(p.borrow().iter()) as Box<dyn Iterator<Item = &T>>),
)
}
// snip
}
This does not compile either:
error[E0515]: cannot return value referencing temporary value
--> src/modeling/port.rs:35:31
|
35 | .flat_map(|p| Box::new(p.borrow().iter()) as Box<dyn Iterator<Item = &T>>),
| ^^^^^^^^^----------^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| | |
| | temporary value created here
| returns a value referencing data owned by the current function
I think I know what the problem is: p.borrow() returns a reference to the port being chained. We use that reference to create the iterator, but as soon as the function is done, the reference goes out of scope and the iterator is no longer valid.
I have no clue on how to deal with this. I managed to implement the following unsafe method:
pub fn iter(&self) -> impl Iterator<Item = &T> {
self.values.iter().chain(self.ports.iter().flat_map(|p| {
Box::new(unsafe { (&*p.as_ref().as_ptr()).iter() }) as Box<dyn Iterator<Item = &T>>
}))
}
While this works, it uses unsafe code, and there must be a safe workaround.
I set a playground for more details of my application. The application compiles and outputs the expected result (but uses unsafe code).

You can't modify anything behind an Rc, that's correct. While this could be solved with a RefCell, you don't want to go down that road: you might end up in a situation where you need to enforce a specific clear() order or similar horrors.
More importantly, your main is fundamentally flawed, ownership-wise. Take these lines:
let mut port_a = Port::new();
let mut port_b = Port::new();
loop {
    // Creates an immutable borrow of port_b with the same lifetime as port_a!
    port_a.chain_port(&port_b);
    // ...
    // A mutable borrow of port_b.
    // But the immutable borrow from above persists across iterations.
    port_b.clear();
    // Or, even if you do fancy shenanigans, at least until this line.
    port_a.clear();
}
To overcome this, just constrain the ports' lifetime to one iteration. You currently clean them up manually anyway, so that's already what you're doing conceptually.
Also, I got rid of that recursive iteration, just to simplify things a little more.
#[derive(Clone)]
pub struct Port<'a, T> {
values: Vec<T>,
ports: Vec<&'a Port<'a, T>>,
}
impl<'a, T> Port<'a, T> {
pub fn new() -> Self {
Self {
values: vec![],
ports: vec![],
}
}
pub fn add_value(&mut self, value: T) {
self.values.push(value);
}
pub fn is_empty(&self) -> bool {
self.values.is_empty() && self.ports.is_empty()
}
pub fn chain_port(&mut self, port: &'a Port<T>) {
if !port.is_empty() {
self.ports.push(&port)
}
}
pub fn iter(&self) -> impl Iterator<Item = &T> {
let mut port_stack: Vec<&Port<T>> = vec![self];
// Sensible estimate I guess.
let mut values: Vec<&T> = Vec::with_capacity(self.values.len() * (self.ports.len() + 1));
while let Some(port) = port_stack.pop() {
values.append(&mut port.values.iter().collect());
port_stack.extend(port.ports.iter());
}
values.into_iter()
}
}
fn main() {
loop {
let mut port_a = Port::new();
let mut port_b = Port::new();
port_a.add_value(1);
port_b.add_value(2);
port_a.chain_port(&port_b);
print!("values in port_a: [ ");
for val in port_a.iter() {
print!("{} ", val);
}
println!("]");
}
}
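If the ports really do have to outlive a single iteration, one safe alternative to the unsafe iter from the question is to give up on returning references and copy the values out instead. This is a sketch under that assumption: it requires T: Clone, and collect_values is a hypothetical method name, not something from the question's code.

```rust
use std::cell::RefCell;
use std::rc::Rc;

pub struct Port<T> {
    values: Vec<T>,
    ports: Vec<Rc<RefCell<Port<T>>>>,
}

impl<T: Clone> Port<T> {
    pub fn new() -> Self {
        Self { values: vec![], ports: vec![] }
    }

    pub fn add_value(&mut self, value: T) {
        self.values.push(value);
    }

    pub fn is_empty(&self) -> bool {
        self.values.is_empty() && self.ports.is_empty()
    }

    pub fn chain_port(&mut self, port: Rc<RefCell<Port<T>>>) {
        if !port.borrow().is_empty() {
            self.ports.push(port)
        }
    }

    /// Clones this port's values followed by those of every chained port.
    /// Each `borrow()` guard is released before the method returns, so no
    /// reference into a `RefCell` ever escapes.
    pub fn collect_values(&self) -> Vec<T> {
        let mut out = self.values.clone();
        for p in &self.ports {
            out.extend(p.borrow().collect_values());
        }
        out
    }

    pub fn clear(&mut self) {
        self.values.clear();
        self.ports.clear();
    }
}

fn main() {
    let port_a = Rc::new(RefCell::new(Port::new()));
    let port_b = Rc::new(RefCell::new(Port::new()));
    for _ in 0..2 {
        port_a.borrow_mut().add_value(1);
        port_b.borrow_mut().add_value(2);
        port_a.borrow_mut().chain_port(Rc::clone(&port_b));
        for val in port_a.borrow().collect_values() {
            println!("{}", val);
        }
        port_a.borrow_mut().clear();
        port_b.borrow_mut().clear();
    }
}
```

The cost is one clone per value on every read, and RefCell still panics at runtime if a port is borrowed mutably while it is being read, for example through a cycle of chained ports.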

Iterate over struct Hashmap and add value to another portion of self

Currently I have a struct that looks something like the following:
struct foo {
pending: HashMap<K, V>,
loaded: Vec<V>,
}
I'm trying to write a function that will load all the pending hashmap values into loaded, while deleting any values from pending that do not meet a certain criterion. This is what I've got right now:
pub async fn load(&mut self) {
let mut rm_vector = Vec::new();
for (key, value) in self.pending.iter_mut() {
self.add_to_loaded(value).await.unwrap_or_else( |err| {
if err == Error::SomeError {
rm_vector.push(key.clone())
}
})
}
for value in rm_vector {
self.pending.remove(&value);
}
}
The issue that I am having is that I am getting a second mutable borrow error when calling the first for loop and the if statement right after. I was wondering if anybody had any suggestions how to go about fixing this without cloning the entire hashmap.

I use the following simplified code to replicate the issue.
use std::collections::HashMap;
struct Foo {
pending: HashMap<String, u32>,
loaded: Vec<u32>,
}
impl Foo {
fn add_to_loaded(&mut self, v: u32) {
self.loaded.push(v);
}
}
impl Foo {
pub fn load(&mut self) {
for _ in self.pending.iter_mut() {
self.add_to_loaded(30);
}
}
}
The cause, I think, is that iter_mut() holds a mutable borrow of self.pending for the whole loop, while add_to_loaded(&mut self) needs a mutable borrow of all of self.
I can think of two ways around the problem.
Option A:
Declare the field loaded as RefCell<Vec<u32>>. With it, the first parameter of add_to_loaded can be declared as an immutable reference to self, and the HashMap can be iterated with iter() instead of iter_mut(). There will be no mutable borrows in the for statement. In add_to_loaded, use borrow_mut() to get a mutable reference and modify the loaded field.
use std::collections::HashMap;
use std::cell::RefCell;
struct Foo {
pending: HashMap<String, u32>,
loaded: RefCell<Vec<u32>>,
}
impl Foo {
fn add_to_loaded(&self, v: u32) {
self.loaded.borrow_mut().push(v);
}
}
impl Foo {
pub fn load(&mut self) {
for _ in self.pending.iter() {
self.add_to_loaded(30);
}
}
}
Option B:
Split the two fields and the two methods into two structs.
use std::collections::HashMap;
struct Foo {
pending: HashMap<String, u32>,
}
struct Foo2 {
loaded: Vec<u32>,
}
impl Foo {
pub fn load(&mut self, f: &mut Foo2) {
for _ in self.pending.iter() {
f.add_to_loaded(30);
}
}
}
impl Foo2 {
fn add_to_loaded(&mut self, v: u32) {
self.loaded.push(v);
}
}
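A third possibility, sketched here under my own assumptions, is to keep one struct but borrow the two fields separately by destructuring self, and let HashMap::retain do the removal. The sketch is synchronous (retain cannot await, so the async version from the question would need a different loop), add_to_loaded becomes a free function over the Vec, and the rule that odd values fail with SomeError is invented purely for illustration.

```rust
use std::collections::HashMap;

#[derive(Debug)]
pub enum Error {
    SomeError,
}

pub struct Foo {
    pending: HashMap<String, u32>,
    loaded: Vec<u32>,
}

// Takes only the field it needs, so calling it does not borrow all of `Foo`.
// Hypothetical rule for illustration: odd values are rejected.
fn add_to_loaded(loaded: &mut Vec<u32>, value: u32) -> Result<(), Error> {
    if value % 2 == 0 {
        loaded.push(value);
        Ok(())
    } else {
        Err(Error::SomeError)
    }
}

impl Foo {
    pub fn load(&mut self) {
        // Destructuring splits `&mut self` into independent borrows of each field.
        let Foo { pending, loaded } = self;
        // Keep an entry unless loading it failed with `SomeError`.
        pending.retain(|_key, value| {
            !matches!(add_to_loaded(loaded, *value), Err(Error::SomeError))
        });
    }
}

fn main() {
    let mut foo = Foo { pending: HashMap::new(), loaded: Vec::new() };
    foo.pending.insert("a".to_string(), 2);
    foo.pending.insert("b".to_string(), 3);
    foo.load();
    println!("loaded: {:?}, still pending: {}", foo.loaded, foo.pending.len());
}
```

No key is cloned and no second pass over a removal list is needed, since retain decides per entry while it iterates.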

Storing types in a HashMap to dynamically instantiate them

I am trying to store structs in a HashMap keyed by string so that I can later create new objects by string. Think of a REST API where clients can get the server to instantiate a specific object by supplying a name.
use std::collections::HashMap;
struct MyStruct;
impl MyStruct {
pub fn new() -> Self {
Self {}
}
}
struct MyOtherStruct;
impl MyOtherStruct {
pub fn new() -> Self {
Self {}
}
}
fn main() {
let mut h = HashMap::new();
h.insert("MyStruct", MyStruct);
h.insert("MyOtherStruct", MyOtherStruct);
// This is pseudo-code
let obj = h.get("MyStruct").unwrap()::new();
}
As I expected, this doesn't work due to syntax errors:
error: expected one of `.`, `;`, `?`, or an operator, found `::`
--> src/main.rs:25:41
|
25 | let obj = h.get("MyStruct").unwrap()::new();
| ^^ expected one of `.`, `;`, `?`, or an operator here
My second attempt was to store a reference to the new method of each struct instead of the types themselves.
use std::collections::HashMap;
struct MyStruct;
impl MyStruct {
pub fn new() -> Self {
Self {}
}
}
struct MyOtherStruct;
impl MyOtherStruct {
pub fn new() -> Self {
Self {}
}
}
fn main() {
let mut h = HashMap::new();
h.insert("MyStruct", &MyStruct::new);
h.insert("MyOtherStruct", &MyOtherStruct::new);
let obj = h.get("MyStruct").unwrap()();
}
This fails because the fn items have different types and can't be stored in the same HashMap:
error[E0308]: mismatched types
--> src/main.rs:22:31
|
22 | h.insert("MyOtherStruct", &MyOtherStruct::new);
| ^^^^^^^^^^^^^^^^^^^ expected fn item, found a different fn item
|
= note: expected type `&fn() -> MyStruct {MyStruct::new}`
found type `&fn() -> MyOtherStruct {MyOtherStruct::new}`
Since I'm pretty new to Rust, I'm out of ideas. How can I solve this problem?

This is ultimately fundamentally impossible. In Rust, local variables are stored on the stack, which means that they have to have a fixed size, known at compile time. Your construction requires the size of the value on the stack to be determined at runtime.
The closest alternative is to move to trait objects, which introduce a layer of indirection:
use std::collections::HashMap;
trait NewThing {
    fn new(&self) -> Box<dyn Thing>;
}
trait Thing {}
struct MyStruct;
impl NewThing for MyStruct {
    fn new(&self) -> Box<dyn Thing> {
        Box::new(Self {})
    }
}
impl Thing for MyStruct {}
struct MyOtherStruct;
impl NewThing for MyOtherStruct {
    fn new(&self) -> Box<dyn Thing> {
        Box::new(Self {})
    }
}
impl Thing for MyOtherStruct {}
fn main() {
    // The map stores factories as trait objects, so both types fit in it.
    let mut h: HashMap<_, Box<dyn NewThing>> = HashMap::new();
    h.insert("MyStruct", Box::new(MyStruct));
    h.insert("MyOtherStruct", Box::new(MyOtherStruct));
    let obj = h["MyStruct"].new();
}
You will find this pattern out in the world, such as in hyper's NewService.
What is [the value of &self of method new] when calling h["MyStruct"].new()?
It's an instance of MyStruct or MyOtherStruct. The only reason that the same type can implement both traits is because there's no real unique state for the "factory" and the "instance". In more complicated implementations, these would be two different types.
Using the same type is common for such cases as sharing a reference-counted value.
See also:
Is it possible to have a constructor function in a trait?

Here is a more complex example of @Shepmaster's solution, using different types for the factories and the objects themselves:
use std::collections::HashMap;
trait NewThing {
    fn new(&self) -> Box<dyn Thing>;
}
trait Thing {
    fn execute(&mut self);
}
// MyStruct
struct MyStructFactory;
impl NewThing for MyStructFactory {
    fn new(&self) -> Box<dyn Thing> {
        Box::new(MyStruct { test: 12, name: "Test".into() })
    }
}
struct MyStruct {
    test: i32,
    name: String,
}
impl Thing for MyStruct {
    fn execute(&mut self) {
        self.test += 1;
        println!("MyStruct {} {}", self.test, self.name);
    }
}
// MyOtherStruct
struct MyOtherStructFactory;
impl NewThing for MyOtherStructFactory {
    fn new(&self) -> Box<dyn Thing> {
        Box::new(MyOtherStruct { my_member: 1 })
    }
}
struct MyOtherStruct {
    my_member: u32,
}
impl Thing for MyOtherStruct {
    fn execute(&mut self) {
        println!("MyOtherStruct.my_member: {}", self.my_member);
    }
}
fn main() {
    let mut h: HashMap<_, Box<dyn NewThing>> = HashMap::new();
    h.insert("MyStruct", Box::new(MyStructFactory));
    h.insert("MyOtherStruct", Box::new(MyOtherStructFactory));
    h["MyStruct"].new().execute();
    h["MyOtherStruct"].new().execute();
}
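When the factories carry no state at all, the second attempt from the question can be made to work by storing plain function pointers that all share one signature. This is a sketch only: the Constructor alias, the registry function and the describe method are names I made up for illustration.

```rust
use std::collections::HashMap;

trait Thing {
    fn describe(&self) -> String;
}

struct MyStruct;
impl MyStruct {
    pub fn new() -> Self {
        Self {}
    }
}
impl Thing for MyStruct {
    fn describe(&self) -> String {
        "MyStruct".to_string()
    }
}

struct MyOtherStruct;
impl MyOtherStruct {
    pub fn new() -> Self {
        Self {}
    }
}
impl Thing for MyOtherStruct {
    fn describe(&self) -> String {
        "MyOtherStruct".to_string()
    }
}

// Every constructor is coerced to this one function pointer type, which is
// what lets differently-typed `new` functions live in the same map.
type Constructor = fn() -> Box<dyn Thing>;

fn registry() -> HashMap<&'static str, Constructor> {
    let mut h: HashMap<&'static str, Constructor> = HashMap::new();
    // Non-capturing closures coerce to function pointers.
    h.insert("MyStruct", || -> Box<dyn Thing> { Box::new(MyStruct::new()) });
    h.insert("MyOtherStruct", || -> Box<dyn Thing> { Box::new(MyOtherStruct::new()) });
    h
}

fn main() {
    let h = registry();
    let ctor = h["MyStruct"];
    let obj = ctor();
    println!("{}", obj.describe());
}
```

The original attempt failed because MyStruct::new and MyOtherStruct::new return different types. Wrapping each in a closure that returns Box<dyn Thing> gives them the same signature.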

You could use std::any::Any to erase the type of the entry. Then call downcast::<T>() on the Box<dyn Any> to check whether the entry at that location matches your type; on success you get an Ok(Box<T>).
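Roughly, and only as a sketch: the registry and create functions, the Constructor alias and the id field are assumptions for illustration, while Box<dyn Any>::downcast is the standard library call.

```rust
use std::any::Any;
use std::collections::HashMap;

struct MyStruct {
    id: u32,
}

struct MyOtherStruct;

// Hypothetical alias: a constructor that hides the concrete type behind `Any`.
type Constructor = fn() -> Box<dyn Any>;

fn registry() -> HashMap<&'static str, Constructor> {
    let mut h: HashMap<&'static str, Constructor> = HashMap::new();
    h.insert("MyStruct", || -> Box<dyn Any> { Box::new(MyStruct { id: 7 }) });
    h.insert("MyOtherStruct", || -> Box<dyn Any> { Box::new(MyOtherStruct) });
    h
}

// Looks the name up and runs its constructor; `None` for unknown names.
fn create(name: &str) -> Option<Box<dyn Any>> {
    registry().get(name).map(|ctor| ctor())
}

fn main() {
    let boxed = create("MyStruct").expect("unknown type name");
    // `downcast` hands back the original box in `Err` if the type does not match.
    match boxed.downcast::<MyStruct>() {
        Ok(s) => println!("got MyStruct with id {}", s.id),
        Err(_) => println!("entry was some other type"),
    }
}
```

The caller has to know which concrete type to ask for, so this fits best when the set of types is small and known at the call site.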

Force fields to share lifetimes

I'm using Rust 0.13, and am rather new to Rust.
I have a struct that would like to own a string, input, but I have code that would like to work with slices of that string through a second field, work.
pub struct Lexer<'a> {
input : Option<String>,
work : &'a str,
...
}
My goal is to pass a string to the struct, have it create its own copy, then to create an initial slice pointing to that string. Ideally, I can now use this slice to manipulate it, as the memory backing the slice won't ever change.
pub fn input(&mut self, input : String) {
self.input = Some(input.clone());
self.work = self.input.unwrap().as_slice();
}
impl<'lex> Iterator<Token> for Lexer<'lex> {
fn next(&mut self) -> Option<Token> {
// ...Do work...
match regex!("\\S").find(self.work) {
Some((0, end)) => {
// Cheap to move the view around
self.work = self.work.slice_from(end);
},
_ => ()
}
// ... Do more work ...
}
}
However, this doesn't work because the lifetime is too short:
error: borrowed value does not live long enough
self.work = self.input.unwrap().as_slice();
^~~~~~~~~~~~~~~~~~~
I'm interpreting this to mean that self.input could change, invalidating self.work's view.
Is this a reasonable interpretation?
Is there a way to specify that these fields are tied to each other somehow?
I think if I could specify that Lexer.input is final this would work, but it doesn't look like Rust has a way to do this.
Edit: sample calling code
let mut lexer = lex::Lexer::new();
lexer.add("[0-9]+", Token::NUM);
lexer.add("\\+", Token::PLUS);
for line in io::stdin().lock().lines() {
match line {
Ok(input) => {
lexer.input(input.as_slice());
lexer.lex();
},
Err(e) => ()
}
}

I think your issue can be solved by adding one more layer. You can have one layer that collects the rules of your lexer, and then you create a new struct that actually does the lexing. This is parallel to how the iterators in Rust are implemented themselves!
struct MetaLexer<'a> {
rules: Vec<(&'a str, u32)>,
}
impl<'a> MetaLexer<'a> {
fn new() -> MetaLexer<'a> { MetaLexer { rules: Vec::new() } }
fn add_rule(&mut self, name: &'a str, val: u32) {
self.rules.push((name, val));
}
fn lex<'r, 's>(&'r self, s: &'s str) -> Lexer<'a, 's, 'r> {
Lexer {
rules: &self.rules,
work: s,
}
}
}
struct Lexer<'a : 'r, 's, 'r> {
rules: &'r [(&'a str, u32)],
work: &'s str,
}
impl<'a, 's, 'r> Iterator for Lexer<'a, 's, 'r> {
type Item = u32;
fn next(&mut self) -> Option<u32> {
for &(name, val) in self.rules.iter() {
if self.work.starts_with(name) {
self.work = &self.work[name.len()..];
return Some(val);
}
}
None
}
}
fn main() {
let mut ml = MetaLexer::new();
ml.add_rule("hello", 10);
ml.add_rule("world", 3);
for input in ["hello", "world", "helloworld"].iter() {
// So that we have an allocated string,
// like io::stdin().lock().lines() might give us
let input = input.to_string();
println!("Input: '{}'", input);
for token in ml.lex(&input) {
println!("Token was: {}", token);
}
}
}
Really, you could rename MetaLexer -> Lexer and Lexer -> LexerItems, and then you'd really match the iterators in the standard lib.
If your question is really how do I keep references to the data read from stdin, that's a different question, and very far from your original statement.
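If the lexer does need to own its input, one way to avoid a field that borrows from a sibling field is to store a byte offset instead of a slice and rebuild the slice on demand. This is a sketch in current Rust rather than 0.13, with made-up names (pos, set_input, work) and whitespace-separated words standing in for real tokens.

```rust
pub struct Lexer {
    input: String,
    // Byte offset of the unread part of `input`; replaces the `work` slice.
    pos: usize,
}

impl Lexer {
    pub fn new(input: String) -> Self {
        Lexer { input, pos: 0 }
    }

    /// Replaces the owned input and rewinds, so one lexer can be reused per line.
    pub fn set_input(&mut self, input: String) {
        self.input = input;
        self.pos = 0;
    }

    /// The view that the `work` field was meant to provide.
    pub fn work(&self) -> &str {
        &self.input[self.pos..]
    }
}

impl Iterator for Lexer {
    type Item = String;

    fn next(&mut self) -> Option<String> {
        let rest = &self.input[self.pos..];
        let trimmed = rest.trim_start();
        if trimmed.is_empty() {
            self.pos = self.input.len();
            return None;
        }
        // Skipped whitespace plus the token length gives the new offset.
        let start = self.pos + (rest.len() - trimmed.len());
        let len = trimmed.find(char::is_whitespace).unwrap_or(trimmed.len());
        let token = trimmed[..len].to_string();
        self.pos = start + len;
        Some(token)
    }
}

fn main() {
    let mut lexer = Lexer::new("12 + 34".to_string());
    while let Some(token) = lexer.next() {
        println!("Token was: {}", token);
    }
    lexer.set_input("5 + 6".to_string());
    println!("remaining input: '{}'", lexer.work());
}
```

Moving the view around is still cheap, since only an integer changes. The tokens are returned as owned Strings here so they do not borrow from the lexer.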
