I have a program that revolves around one shared data structure, proposing changes to the data, and then applying these changes at a later stage. These proposed changes hold references to the core object.
In C++ or another language, I would simply make the reference non-const, then mutate it when I need to. But Rust doesn't play well with this approach. (I asked about this in IRC earlier today, but sadly I'm still stuck.)
To help, I made a distilled example for booking tickets in a theatre, where theatre is the data structure, the Bookings are proposed changes, and the run method would be applying them if I could figure out how to get it to work!
Firstly, defining some data structures. A theatre has many rows, which have many seats each:
use std::sync::{Arc, RwLock};
use std::thread;
struct Theatre { rows: Vec<Row> }
struct Row { seats: Vec<Seat> }
struct Seat {
    number: i32,
    booked: bool,
}
impl Seat {
    fn new(number: i32) -> Seat {
        Seat { number: number, booked: false }
    }
    fn book(&mut self) {
        self.booked = true;
    }
}
Here, the get_booking method searches for a seat, returning a Booking with a reference to the seat it finds.
impl Theatre {
    fn get_booking<'t>(&'t self, number: i32) -> Option<Booking<'t>> {
        for row in self.rows.iter() {
            for seat in row.seats.iter() {
                if seat.number == number && seat.booked == false {
                    return Some(Booking { seats: vec![ seat ] })
                }
            }
        }
        None
    }
}
But this is where I get stuck. The run method has mutable access to the overall theatre (from its parameter), and it knows which seat to mutate (self). But since self isn't mutable, even though the theatre that contains it is, it can't be mutated.
struct Booking<'t> {
    seats: Vec<&'t Seat>
}
impl<'t> Booking<'t> {
    fn describe(&self) {
        let seats: Vec<_> = self.seats.iter().map(|s| s.number).collect();
        println!("You want to book seats: {:?}", seats);
    }
    fn run(&self, _theatre: &mut Theatre) {
        let mut seat = ??????;
        seat.book();
    }
}
Finally, a main method that would use it if it worked.
fn main() {
    // Build a theatre (with only one seat... small theatre)
    let theatre = Theatre { rows: vec![ Row { seats: vec![ Seat::new(7) ] } ] };
    let wrapper = Arc::new(RwLock::new(theatre));
    // Try to book a seat in another thread
    let thread = thread::spawn(move || {
        let desired_seat_number = 7;
        let t = wrapper.read().unwrap();
        let booking = t.get_booking(desired_seat_number).expect("No such seat!");
        booking.describe();
        let mut tt = wrapper.write().unwrap();
        booking.run(&mut tt); // this is never actually reached because we still have the read lock
    });
    thread.join().unwrap();
}
What's annoying is that I know exactly why my current code doesn't work - I just can't figure out how Rust wants my program formatted instead. There are some things I don't want to do:
The simplest solution is to have Booking hold an index to its seat, instead of a reference: in this case, with row and seat usize fields. However, although my theatre uses O(1) vectors, I'd also like to reference a value in the middle of a large tree, where having to iterate to find the value would be much more expensive. This would also mean that you couldn't, say, get the seat number (in the describe function) without having to pass in the entire Theatre.
It would also be solved by having a Booking hold a mutable reference to the seat, which I could just then mutate as normal. However, this would mean I could only have one proposed change at a time: I couldn't, for example, have a list of bookings and apply them all at once, or have two bookings and only apply one.
I feel like I'm very close to having something that Rust will accept, but don't quite know how to structure my program to accommodate it. So, any pointers? (pun intended)
First, here's the code:
use std::sync::{Arc, RwLock};
use std::thread;
use std::sync::atomic::{AtomicBool, Ordering};
struct Theatre { rows: Vec<Row> }
struct Row { seats: Vec<Seat> }
struct Seat {
    number: i32,
    booked: AtomicBool,
}
impl Seat {
    fn new(number: i32) -> Seat {
        Seat { number: number, booked: AtomicBool::new(false) }
    }
    fn book(&self) {
        self.booked.store(true, Ordering::Release);
        println!("Booked seat: {:?}", self.number);
    }
}
impl Theatre {
    fn get_booking<'t>(&'t self, number: i32) -> Option<Booking<'t>> {
        for row in self.rows.iter() {
            for seat in row.seats.iter() {
                if seat.number == number && seat.booked.load(Ordering::Acquire) == false {
                    return Some(Booking { seats: vec![ seat ] })
                }
            }
        }
        None
    }
}
struct Booking<'t> {
    seats: Vec<&'t Seat>
}
impl<'t> Booking<'t> {
    fn describe(&self) {
        let seats: Vec<_> = self.seats.iter().map(|s| s.number).collect();
        println!("You want to book seats: {:?}", seats);
    }
    fn run(&self) {
        for seat in self.seats.iter() {
            seat.book();
        }
    }
}
fn main() {
    // Build a theatre (with only one seat... small theatre)
    let theatre = Theatre { rows: vec![ Row { seats: vec![ Seat::new(7) ] } ] };
    let wrapper = Arc::new(RwLock::new(theatre));
    // Try to book a seat in another thread
    let thread = thread::spawn(move || {
        let desired_seat_number = 7;
        let t = wrapper.read().unwrap();
        let booking = t.get_booking(desired_seat_number).expect("No such seat!");
        booking.describe();
        booking.run();
    });
    thread.join().unwrap();
}
There are two important changes:
The booked field was changed from bool to AtomicBool. The atomic types provide a store method that is available on immutable references. Therefore, we can make Seat::book() take self by immutable reference. If you have a more complex type that is not covered by the atomic types, you should instead use a Mutex or a RwLock.
I removed the &mut Theatre parameter on Booking::run(). If this is not acceptable, please leave a comment to explain why you need that reference.
As you found, you cannot have both a read lock and a write lock active at the same time on a RwLock. However, a Booking cannot live longer than the read lock on the Theatre, because it contains references into the Theatre. Once you release the read lock, you cannot guarantee that the references you obtained will remain valid when you acquire another lock later on. If that's a problem, consider using Arc instead of simple borrowed pointers (&).
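As a minimal sketch of that last suggestion, assuming it is acceptable to store every seat behind an Arc: the Booking below owns Arc<Seat> handles instead of borrowing from the Theatre, so it can outlive the read lock. Flattening the rows into a single Vec, having book return a bool, and using compare_exchange (so that only one of two competing bookings succeeds) are my assumptions and are not part of the code above.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, RwLock};

struct Seat {
    number: i32,
    booked: AtomicBool,
}

impl Seat {
    fn new(number: i32) -> Seat {
        Seat { number, booked: AtomicBool::new(false) }
    }

    // Returns true only for the caller that flips `booked` from false to true.
    fn book(&self) -> bool {
        self.booked
            .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
            .is_ok()
    }
}

struct Theatre {
    seats: Vec<Arc<Seat>>, // rows omitted to keep the sketch short
}

// No lifetime parameter: the booking shares ownership of its seats.
struct Booking {
    seats: Vec<Arc<Seat>>,
}

impl Theatre {
    fn get_booking(&self, number: i32) -> Option<Booking> {
        self.seats
            .iter()
            .find(|s| s.number == number && !s.booked.load(Ordering::Acquire))
            .map(|s| Booking { seats: vec![Arc::clone(s)] })
    }
}

impl Booking {
    // Tries to book every seat and reports whether all of them were still free.
    fn run(&self) -> bool {
        self.seats.iter().fold(true, |ok, seat| seat.book() && ok)
    }
}

fn main() {
    let theatre = Arc::new(RwLock::new(Theatre { seats: vec![Arc::new(Seat::new(7))] }));
    // Each read guard is a temporary that is released at the end of its statement.
    let booking = theatre.read().unwrap().get_booking(7).expect("No such seat!");
    let again = theatre.read().unwrap().get_booking(7).expect("No such seat!");
    println!("first: {}, second: {}", booking.run(), again.run());
}
```

Both bookings can be held at the same time and applied in any order; whichever runs second simply reports that the seat was already taken.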
The following code is used as an example of my problem. A structure named State contains a number of Residents.
Now there is a function that needs to modify both the State property and one of the Resident's properties.
Since it is not possible to hold a mutable borrow of State and of one of the Residents in that State at the same time, the code does not compile.
I can think of two ways to solve it.
One is to give modify_state_and_resident() only one parameter, a mutable reference to State. But then I have to look up the Resident in the hash map again inside modify_state_and_resident(), which is expensive.
Another way is to split the State structure, separating its properties and its residents into separate variables. But this would bring other logical problems. After all, this is a complete entity that has to be referenced everywhere as a whole.
I don't know if there is a better way to solve it.
#[derive(Debug)]
struct Resident {
    age: i32,
    name: String,
    viewcnt: i32,
}
use std::collections::HashMap;
#[derive(Debug)]
struct State {
    version: i32,
    residents: HashMap<String, Resident>,
}
// This function cannot be invoked from modify_state_and_resident
fn show_resident(resident: &Resident) {
    println!("{:?}", resident);
}
fn modify_state_and_resident(class: &mut State, resident: &mut Resident) {
    // I do not want to call hash get again.
    class.version = class.version + 1;
    resident.viewcnt = resident.viewcnt + 1;
}
#[test]
fn whole_part_mutable() {
    let mut s = State {
        version: 1,
        residents: HashMap::from([
            (
                String::from("this is a man who named Aaron"),
                Resident{age: 18, name: String::from("Aaron"), viewcnt: 0}
            ),
        ])};
    // get is expensive, I just want to call it when necessary
    let r = s.residents.get_mut("this is a man who named Aaron").unwrap();
    // can not call from other function
    show_resident(r);
    modify_state_and_resident(&mut s, r);
}
You can destructure a &mut State to get individual mutable access to both parts:
fn modify_state_and_resident(version: &mut i32, resident: &mut Resident) {
    // I do not want to call hash get again.
    *version = *version + 1;
    resident.viewcnt = resident.viewcnt + 1;
}
#[test]
fn whole_part_mutable() {
    let mut s = State {
        version: 1,
        residents: HashMap::from([
            (
                String::from("this is a man who named Aaron"),
                Resident{age: 18, name: String::from("Aaron"), viewcnt: 0}
            ),
        ])};
    let State { version, residents } = &mut s;
    // get is expensive, I just want to call it when necessary
    let r = residents.get_mut("this is a man who named Aaron").unwrap();
    // can not call from other function
    show_resident(r);
    modify_state_and_resident(version, r);
    println!("{:?}", s);
}
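If you would rather keep State in one piece at the call site, another option is to move the lookup and both updates into a method: within a single function body, the borrow checker accepts mutable borrows of disjoint fields reached through self. The sketch below assumes that is acceptable for your design; the method name view and the shortened key are hypothetical.

```rust
use std::collections::HashMap;

#[derive(Debug)]
struct Resident {
    age: i32,
    name: String,
    viewcnt: i32,
}

#[derive(Debug)]
struct State {
    version: i32,
    residents: HashMap<String, Resident>,
}

impl State {
    // Hypothetical helper: one hash lookup, then both updates.
    // Returns false if no resident is stored under `key`.
    fn view(&mut self, key: &str) -> bool {
        match self.residents.get_mut(key) {
            Some(resident) => {
                // `self.version` and `self.residents` are disjoint fields,
                // so both can be mutably borrowed here at the same time.
                self.version += 1;
                resident.viewcnt += 1;
                true
            }
            None => false,
        }
    }
}

fn main() {
    let mut s = State {
        version: 1,
        residents: HashMap::from([(
            String::from("aaron"),
            Resident { age: 18, name: String::from("Aaron"), viewcnt: 0 },
        )]),
    };
    s.view("aaron");
    println!("{:?}", s);
}
```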
My code is below, and also on the playground.
use rand::Rng;
const THRESHOLD: i32 = 50;
#[derive(Debug)]
struct Game {
    plays: Vec<i32>
}
impl Game {
    fn new() -> Self {
        Self {
            plays: vec![]
        }
    }
    /// A game wins when the sum of all plays exceeds the threshold
    fn play(&mut self, play: i32) -> bool {
        self.plays.push(play);
        self.plays.iter().sum::<i32>() > THRESHOLD
    }
}
fn main() {
    // Build the games
    let mut games: Vec<Game> = Vec::new();
    let mut rng = rand::thread_rng();
    for _ in 1..10 {
        games.push(Game::new());
    }
    // Play the games & find a winner
    loop {
        if let Some(winner) = games
            .iter_mut()
            .find(|game| {
                let play = rng.gen_range(1..=10);
                game.play(play)
            }) {
            println!("Winner!: {:?}", winner);
        }
    }
}
The compiler doesn't like game.play(play) inside the predicate given to find saying:
`game` is a `&` reference, so the data it refers to cannot be borrowed as mutable
My attempts to dereference game have only further offended the borrow checker. What is the idiomatic way to call a mutating method inside a find predicate?
You can use Iterator::find_map instead, whose closure takes in elements by value which can then be easily mutated:
games
    .iter_mut()
    .find_map(|game| {
        let play = rng.gen_range(1..=10);
        game.play(play).then(|| game)
    })
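For a version that runs without the rand crate, here is a self-contained sketch of the same find_map approach. The play_round helper and the next_play closure, which stands in for the random number generator, are hypothetical names introduced only for this illustration.

```rust
const THRESHOLD: i32 = 50;

#[derive(Debug)]
struct Game {
    plays: Vec<i32>,
}

impl Game {
    /// Records a play; the game wins once the sum of its plays exceeds the threshold.
    fn play(&mut self, play: i32) -> bool {
        self.plays.push(play);
        self.plays.iter().sum::<i32>() > THRESHOLD
    }
}

/// Plays the games in order and returns the first one that wins, if any.
/// Games after the winner are not played in that round.
fn play_round<'a>(
    games: &'a mut [Game],
    mut next_play: impl FnMut() -> i32,
) -> Option<&'a mut Game> {
    games.iter_mut().find_map(|game| {
        let play = next_play();
        // `game` is a `&mut Game` here, so calling `play` is allowed.
        game.play(play).then(|| game)
    })
}

fn main() {
    let mut games: Vec<Game> = (0..3).map(|_| Game { plays: vec![] }).collect();
    let mut rounds = 0;
    loop {
        rounds += 1;
        if let Some(winner) = play_round(&mut games, || 10) {
            println!("Winner after {} rounds: {:?}", rounds, winner);
            break;
        }
    }
}
```

The difference from find is that iter_mut().find hands the predicate a `&&mut Game`, which only permits shared access, whereas find_map passes each `&mut Game` through by value.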
I have created a simplified version of my problem below. I have a Bag struct and an Item struct. I want to spawn 10 threads that each execute the item_action method from Bag on one item in an item_list, and print a statement if both of the item's attributes are in the bag's attributes.
use std::sync::{Mutex,Arc};
use std::thread;
#[derive(Clone, Debug)]
struct Bag{
    attributes: Arc<Mutex<Vec<usize>>>
}
impl Bag {
    fn new(n: usize) -> Self {
        let mut v = Vec::with_capacity(n);
        for _ in 0..n {
            v.push(0);
        }
        Bag{
            attributes:Arc::new(Mutex::new(v)),
        }
    }
    fn item_action(&self, item_attr1: usize, item_attr2: usize) -> Result<(),()> {
        if self.attributes.lock().unwrap().contains(&item_attr1) ||
            self.attributes.lock().unwrap().contains(&item_attr2) {
            println!("Item attributes {} and {} are in Bag attribute list!", item_attr1, item_attr2);
            Ok(())
        } else {
            Err(())
        }
    }
}
#[derive(Clone, Debug)]
struct Item{
    item_attr1: usize,
    item_attr2: usize,
}
impl Item{
    pub fn new(item_attr1: usize, item_attr2: usize) -> Self {
        Item{
            item_attr1: item_attr1,
            item_attr2: item_attr2
        }
    }
}
fn main() {
    let mut item_list: Vec<Item> = Vec::new();
    for i in 0..10 {
        item_list.push(Item::new(i, (i+1)%10));
    }
    let bag: Bag= Bag::new(10); //create 10 attributes
    let mut handles = Vec::with_capacity(10);
    for x in 0..10 {
        let bag2 = bag.clone();
        let item_list2= item_list.clone();
        handles.push(
            thread::spawn(move || {
                bag2.item_action(item_list2[x].item_attr1, item_list2[x].item_attr2);
            })
        )
    }
    for h in handles {
        println!("Here");
        h.join().unwrap();
    }
}
When running, I only got one line, and the program just stops there without returning.
Item attributes 0 and 1 are in Bag attribute list!
May I know what went wrong? Please see code in Playground
Updated:
With the suggestion from @loganfsmyth, the program can return now... but still only prints 1 line as above. I expect it to print 10 because my item_list has 10 items. Not sure if my thread logic is correct.
I have added println!("Here"); when calling join all threads. And I can see Here is printed 10 times, just not the actual log from item_action
I believe this is because Rust is not running your
if self.attributes.lock().unwrap().contains(&item_attr1) ||
    self.attributes.lock().unwrap().contains(&item_attr2) {
expression the way you expect. Rust evaluates the two operands from left to right, but the temporary MutexGuard created by the first lock() call is only dropped at the end of the whole condition, not as soon as the first contains call returns. Whenever the first check is false, the second lock() therefore runs while the first guard is still held. In that case you essentially end up with
let condition = {
    let lock1 = self.attributes.lock().unwrap();
    let lock2 = self.attributes.lock().unwrap();
    lock1.contains(&item_attr1) || lock2.contains(&item_attr2)
};
if condition {
which is causing your code to deadlock.
You should instead write:
let attributes = self.attributes.lock().unwrap();
if attributes.contains(&item_attr1) ||
    attributes.contains(&item_attr2) {
so that there is only one lock.
Your code would also work as-is if you used an RwLock or ReentrantMutex instead of a Mutex since those allow the same thread to have multiple immutable references to the data.
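Put together, a trimmed-down sketch of the program with the single-lock fix could look like the following. I have dropped the Item struct and made each thread return its result so the outcome is visible per item; those changes are mine and are not required for the fix.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

#[derive(Clone, Debug)]
struct Bag {
    attributes: Arc<Mutex<Vec<usize>>>,
}

impl Bag {
    fn new(n: usize) -> Self {
        Bag { attributes: Arc::new(Mutex::new(vec![0; n])) }
    }

    fn item_action(&self, item_attr1: usize, item_attr2: usize) -> Result<(), ()> {
        // Take the lock once; the guard is released when `attributes` goes out of scope.
        let attributes = self.attributes.lock().unwrap();
        if attributes.contains(&item_attr1) || attributes.contains(&item_attr2) {
            Ok(())
        } else {
            Err(())
        }
    }
}

fn main() {
    let bag = Bag::new(10);
    let handles: Vec<_> = (0..10usize)
        .map(|i| {
            let bag = bag.clone();
            // Same attribute pairs as in the question: (i, (i + 1) % 10).
            thread::spawn(move || (i, bag.item_action(i, (i + 1) % 10)))
        })
        .collect();
    for h in handles {
        let (i, result) = h.join().unwrap();
        println!("item {}: {:?}", i, result);
    }
}
```

Note that Bag::new fills the attribute list with zeros, so only the items that contain a 0 (item 0 and item 9) can match. The other threads return Err(()) and print nothing in the original program, even once the deadlock is gone.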
I implemented a tree struct:
use std::collections::VecDeque;
use std::rc::{Rc, Weak};
use std::cell::RefCell;
struct A {
    children: Option<VecDeque<Rc<RefCell<A>>>>
}
// I got thread '<main>' has overflowed its stack
fn main(){
    let mut tree_stack: VecDeque<Rc<RefCell<A>>> = VecDeque::new();
    // when num is 1000, everything works
    for i in 0..100000 {
        tree_stack.push_back(Rc::new(RefCell::new(A {children: None})));
    }
    println!("{:?}", "reach here means we are not out of mem");
    loop {
        if tree_stack.len() == 1 {break;}
        let mut new_tree_node = Rc::new(RefCell::new(A {children: None}));
        let mut tree_node_children: VecDeque<Rc<RefCell<A>>> = VecDeque::new();
        // combine last two nodes to one new node
        match tree_stack.pop_back() {
            Some(x) => {
                tree_node_children.push_front(x);
            },
            None => {}
        }
        match tree_stack.pop_back() {
            Some(x) => {
                tree_node_children.push_front(x);
            },
            None => {}
        }
        new_tree_node.borrow_mut().children = Some(tree_node_children);
        tree_stack.push_back(new_tree_node);
    }
}
But it crashes with
thread '<main>' has overflowed its stack
How do I fix that?
The problem that you are experiencing is because you have a giant linked-list of nodes. When that list is dropped, the first element tries to free all the members of the struct first. That means that the second element does the same, and so on, until the end of the list. This means that you will have a call stack that is proportional to the number of elements in your list!
Here's a small reproduction:
struct A {
    children: Option<Box<A>>
}
fn main() {
    let mut list = A { children: None };
    for _ in 0..1_000_000 {
        list = A { children: Some(Box::new(list)) };
    }
}
And here's how you would fix it:
impl Drop for A {
    fn drop(&mut self) {
        if let Some(mut child) = self.children.take() {
            while let Some(next) = child.children.take() {
                child = next;
            }
        }
    }
}
This code overrides the default recursive drop implementation with an iterative one. It rips the children out of the node, replacing it with a terminal item (None). It then allows the node to drop normally, but there will be no recursive calls.
The code is complicated a bit because we can't drop ourselves, so we need to do a little two-step dance to ignore the first item and then eat up all the children.
See also:
How can I swap in a new value for a field in a mutable reference to a structure?
How do I move out of a struct field that is an Option?
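The fix above is written for the Box-based reproduction. Applied to the Rc<RefCell<A>> tree from the question, the same idea could be sketched as below: an explicit work list replaces the recursion, and Rc::try_unwrap only takes a node apart when no other Rc still points to it. The pending list and the build_deep_tree helper are names I made up for this sketch.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;
use std::rc::Rc;

struct A {
    children: Option<VecDeque<Rc<RefCell<A>>>>,
}

impl Drop for A {
    fn drop(&mut self) {
        // Work list of nodes whose subtrees still have to be torn down.
        let mut pending: Vec<Rc<RefCell<A>>> =
            self.children.take().into_iter().flatten().collect();
        while let Some(node) = pending.pop() {
            // Only the last owner gets the node back; otherwise another Rc keeps it alive.
            if let Ok(cell) = Rc::try_unwrap(node) {
                let mut a = cell.into_inner();
                if let Some(kids) = a.children.take() {
                    pending.extend(kids);
                }
                // `a` is dropped here with no children left, so its drop does not recurse.
            }
        }
    }
}

// Builds the same shape as in the question: each new node adopts the previous
// tree plus one leaf, giving a tree that is roughly `leaves` levels deep.
fn build_deep_tree(leaves: usize) -> Rc<RefCell<A>> {
    let mut stack: VecDeque<Rc<RefCell<A>>> = (0..leaves)
        .map(|_| Rc::new(RefCell::new(A { children: None })))
        .collect();
    while stack.len() > 1 {
        let mut kids = VecDeque::new();
        for _ in 0..2 {
            if let Some(x) = stack.pop_back() {
                kids.push_front(x);
            }
        }
        stack.push_back(Rc::new(RefCell::new(A { children: Some(kids) })));
    }
    stack.pop_back().expect("at least one leaf is required")
}

fn main() {
    let tree = build_deep_tree(100_000);
    drop(tree); // torn down by the loop above rather than by nested drop calls
    println!("dropped without overflowing the stack");
}
```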
I have a set of jobs that I am trying to run in parallel. I want to run each task on its own thread and gather the responses on the calling thread.
Some jobs may take much longer than others, so I'd like to start using each result as it comes in, and not have to wait for all jobs to complete.
Here is an attempt:
struct Container<T> {
    items : Vec<T>
}
#[derive(Debug)]
struct Item {
    x: i32
}
impl Item {
    fn foo (&mut self) {
        self.x += 1; //consider an expensive mutating computation
    }
}
fn main() {
    use std;
    use std::sync::{Mutex, Arc};
    use std::collections::RingBuf;
    //set up a container with 2 items
    let mut item1 = Item { x: 0};
    let mut item2 = Item { x: 1};
    let container = Container { items: vec![item1, item2]};
    //set a gather system for our results
    let ringBuf = Arc::new(Mutex::new(RingBuf::<Item>::new()));
    //farm out each job to its own thread...
    for item in container.items {
        std::thread::Thread::spawn(|| {
            item.foo(); //job
            ringBuf.lock().unwrap().push_back(item); //push item back to caller
        });
    }
    loop {
        let rb = ringBuf.lock().unwrap();
        if rb.len() > 0 { //gather results as soon as they are available
            println!("{:?}",rb[0]);
            rb.pop_front();
        }
    }
}
For starters, this does not compile due to the impenetrable cannot infer an appropriate lifetime due to conflicting requirements error.
What am I doing wrong and how do I do it right?
You've got a couple of compounding issues, but the first one is a misuse / misunderstanding of Arc. You need to give each thread its own copy of the Arc. The Arc takes care of the shared ownership, and the Mutex inside it makes sure that changes are synchronized. The main changes were the addition of .clone() and the move keyword:
for item in container.items {
    let mrb = ringBuf.clone();
    std::thread::Thread::spawn(move || {
        item.foo(); //job
        mrb.lock().unwrap().push_back(item); //push item back to caller
    });
}
After changing this, you'll run into some simpler errors about forgotten mut qualifiers, and then you hit another problem - you are trying to send mutable references across threads. Your for loop will need to return &mut Item to call foo, but this doesn't match your Vec. Changing it, we can get to something that compiles:
for mut item in container.items.into_iter() {
    let mrb = ringBuf.clone();
    std::thread::Thread::spawn(move || {
        item.foo(); //job
        mrb.lock().unwrap().push_back(item); //push item back to caller
    });
}
Here, we consume the input vector, moving each of the Items to the worker thread. Unfortunately, this hits the Playpen timeout, so there's probably some deeper issue.
All that being said, I'd highly recommend using channels:
#![feature(std_misc)]
use std::sync::mpsc::channel;
#[derive(Debug)]
struct Item {
    x: i32
}
impl Item {
    fn foo(&mut self) { self.x += 1; }
}
fn main() {
    let items = vec![Item { x: 0 }, Item { x: 1 }];
    let rx = {
        let (tx, rx) = channel();
        for item in items.into_iter() {
            let my_tx = tx.clone();
            std::thread::Thread::spawn(move || {
                let mut item = item;
                item.foo();
                my_tx.send(item).unwrap();
            });
        }
        rx
    };
    for item in rx.iter() {
        println!("{:?}", item);
    }
}
This also times out in the playpen, but works fine when compiled and run locally.
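The snippets above use the thread API as it existed when this was written. On current stable Rust, std::thread::Thread::spawn and the std_misc feature gate are gone, and std::thread::spawn is the function to call. A sketch of the same channel approach for a current compiler, with a hypothetical spawn_jobs helper that I added to keep main short:

```rust
use std::sync::mpsc::{channel, Receiver};
use std::thread;

#[derive(Debug)]
struct Item {
    x: i32,
}

impl Item {
    fn foo(&mut self) {
        self.x += 1; // stands in for an expensive mutating computation
    }
}

// Hypothetical helper: moves each item into its own worker thread and returns
// the receiving end of the channel the workers report back on.
fn spawn_jobs(items: Vec<Item>) -> Receiver<Item> {
    let (tx, rx) = channel();
    for mut item in items {
        let tx = tx.clone();
        thread::spawn(move || {
            item.foo();
            tx.send(item).unwrap();
        });
    }
    // The original `tx` is dropped when this function returns, so the receiver
    // stops yielding once every worker has sent its item.
    rx
}

fn main() {
    // Items arrive in completion order, which may differ from the input order.
    for item in spawn_jobs(vec![Item { x: 0 }, Item { x: 1 }]) {
        println!("{:?}", item);
    }
}
```

Iterating over the receiver handles each result as soon as it arrives, which matches the requirement of not waiting for all jobs to finish.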