Accessing two vectors in a struct locked by a mutex [duplicate] - rust

This question already has an answer here:
Error while trying to borrow 2 fields from a struct wrapped in RefCell
(1 answer)
Closed 3 years ago.
I have a struct with two vectors that is passed through a function while in a Arc<Mutex<TwoArrays>>.
pub struct TwoArrays {
pub a: Vec<i32>,
pub b: Vec<i32>,
}
fn add_arrays(mut foo: Arc<Mutex<TwoArrays>>) {
let mut f = foo.lock().unwrap();
//Loop A: compiles
for i in 0..f.a.len() {
for j in 0..f.b.len() {
f.b[j] += f.a[i];
}
}
//Loop B: does not compile
for i in f.a.iter() {
for j in 0..f.b.len() {
f.b[j] += i;
}
}
}
When I make a loop that uses an iterator, with another loop writing inside(Loop B), the compiler complains:
error[E0502]: cannot borrow `f` as mutable because it is also borrowed as immutable
Loop A Compiles.
Why is there a immutable borrow on f?
Can I make it only borrow each array individually? That is, a mutable borrow of f.b and a immutable borrow of f.a?
Why does this not happen when I pass TwoArrays directly? It only happens when I pass it in as an Arc<Mutex<TwoArrays>>

When you unwrap the LockResult you get a MutexGuard, and not directly a TwoArrays. You can use it as if it was a TwoArrays because it implements Deref and DerefMut.
When you try to write 2 loops, you try to use both deref and deref_mut at once: that's impossible:
pub struct TwoArrays {
pub a: Vec<i32>,
pub b: Vec<i32>,
}
fn add_arrays(mut foo: Arc<Mutex<TwoArrays>>) {
let mut f = foo.lock().unwrap();
//Loop B: does not compile
for i in f.a.iter() {
// ^~~~~~~~~~~~~~~~~~~ Implicit call to `deref` here.
for j in 0..f.b.len() {
// ^~~~~~~~~~~~ Another implicit call to `deref` here.
f.b[j] += i;
// ^~~~~~~~~~~~~~~~~~~~ Implicit call to `deref_mut` here.
}
}
}
If you deref_mut once before doing the loops, everything works fine:
use std::{sync::{Arc, Mutex}, ops::DerefMut};
pub struct TwoArrays {
pub a: Vec<i32>,
pub b: Vec<i32>,
}
fn add_arrays(foo: &mut Arc<Mutex<TwoArrays>>) {
let mut mutex_guard = foo.lock().unwrap();
let real_two_arrays = mutex_guard.deref_mut();
for i in &mut real_two_arrays.a {
for j in &real_two_arrays.b {
*i += *j;
}
}
}

You can access to two vector in a struct like following:
use std::mem;
#[derive(Debug)]
pub struct TwoArrays {
pub a: Vec<i32>,
pub b: Vec<i32>,
}
fn add_arrays(mut foo: TwoArrays) {
let a = foo.a.clone();
let mut b = foo.b.clone();
for i in a.iter() {
let mut index = 0;
for _j in b.iter_mut() {
let mut new_value = i.clone() + foo.b[index as usize].clone();
mem::swap(&mut foo.b[index as usize], &mut new_value);
index = index + 1;
}
}
println!("Arrays A: {:?}", &foo.a);
println!("Arrays A: {:?}", &foo.b);
}
fn main() {
let a = vec![1i32, 2i32, 3i32];
let b = vec![4i32, 5i32, 6i32];
let two_arrays = TwoArrays { a, b };
// let foo = Arc::new(Mutex::new(two_arrays));
add_arrays(two_arrays);
}
Playground
Why is there an immutable borrow for f?
Because you tried to iterate it with iter() not the iter_mut()
And why does this not happen when I pass TwoArrays directly? It only happens when I pass it in as an Arc<Mutex<TwoArrays>>
You can pass it as naked struct without the need of Arc<Mutex<>> with the sample code.
If you insist on using Arc<Mutex<>> to pass the same object around with an atomic reference you can change the function signature to the following:
fn add_arrays(mut foo: Arc<Mutex<TwoArrays>>)
And you need to lock() and get that reference from the Arc with following:
let foo = foo.lock().unwrap();
Playground With Arc Usage

Related

When is it safe to extend the lifetime of references into Arenas?

I have a struct which uses Arena:
struct Foo<'f> {
a: Arena<u64>, // from the typed-arena crate
v: Vec<&'f u64>,
}
Is it safe to extend the lifetime of a reference into the arena so long as it is bound by the lifetime of the main struct?
impl<'f> Foo<'f> {
pub fn bar(&mut self, n: u64) -> Option<&'f u64> {
if n == 0 {
None
} else {
let n_ref = unsafe { std::mem::transmute(self.a.alloc(n)) };
Some(n_ref)
}
}
}
For more context, see this Reddit comment.
Is it safe to extend the lifetime of a reference into the arena so long as it is bound by the lifetime of the main struct?
The Arena will be dropped along with Foo so, in principle, this would be safe, but it would also be unnecessary because the Arena already lives long enough.
However, this is not what your code is actually doing! The lifetime 'f could be longer that the lifetime of the struct — it can be as long as the shortest-lived reference inside v. For example:
fn main() {
let n = 1u64;
let v = vec![&n];
let bar;
{
let mut foo = Foo { a: Arena::new(), v };
bar = foo.bar(2);
// foo is dropped here, along with the Arena
}
// bar is still useable here because 'f is the full scope of `n`
println!("bar = {:?}", bar); // Some(8021790808186446178) - oops!
}
Trying to pretend that the lifetime is longer than it really is has created an opportunity for Undefined Behaviour in safe code.
A possible fix is to own the Arena outside of the struct and rely on the borrow checker to make sure that it is not dropped while it is still in use:
struct Foo<'f> {
a: &'f Arena<u64>,
v: Vec<&'f u64>,
}
impl<'f> Foo<'f> {
pub bar(&mut self, n: u64) -> Option<&'f u64> {
if n == 0 {
None
} else {
Some(self.a.alloc(n))
}
}
}
fn main() {
let arena = Arena::new();
let n = 1u64;
let v = vec![&n];
let bar;
{
let mut foo = Foo { a: &arena, v };
bar = foo.bar(2);
}
println!("bar = {:?}", bar); // Some(2)
}
Just like your unsafe version, the lifetimes express that the reference to the Arena must be valid for at least as long as the items in the Vec. However, this also enforces that this fact is true! Since there is no unsafe code, you can trust the borrow-checker that this will not trigger UB.

How do I efficiently build a vector and an index of that vector while processing a data stream?

I have a struct Foo:
struct Foo {
v: String,
// Other data not important for the question
}
I want to handle a data stream and save the result into Vec<Foo> and also create an index for this Vec<Foo> on the field Foo::v.
I want to use a HashMap<&str, usize> for the index, where the keys will be &Foo::v and the value is the position in the Vec<Foo>, but I'm open to other suggestions.
I want to do the data stream handling as fast as possible, which requires not doing obvious things twice.
For example, I want to:
allocate a String only once per one data stream reading
not search the index twice, once to check that the key does not exist, once for inserting new key.
not increase the run time by using Rc or RefCell.
The borrow checker does not allow this code:
let mut l = Vec::<Foo>::new();
{
let mut hash = HashMap::<&str, usize>::new();
//here is loop in real code, like:
//let mut s: String;
//while get_s(&mut s) {
let s = "aaa".to_string();
let idx: usize = match hash.entry(&s) { //a
Occupied(ent) => {
*ent.get()
}
Vacant(ent) => {
l.push(Foo { v: s }); //b
ent.insert(l.len() - 1);
l.len() - 1
}
};
// do something with idx
}
There are multiple problems:
hash.entry borrows the key so s must have a "bigger" lifetime than hash
I want to move s at line (b), while I have a read-only reference at line (a)
So how should I implement this simple algorithm without an extra call to String::clone or calling HashMap::get after calling HashMap::insert?
In general, what you are trying to accomplish is unsafe and Rust is correctly preventing you from doing something you shouldn't. For a simple example why, consider a Vec<u8>. If the vector has one item and a capacity of one, adding another value to the vector will cause a re-allocation and copying of all the values in the vector, invalidating any references into the vector. This would cause all of your keys in your index to point to arbitrary memory addresses, thus leading to unsafe behavior. The compiler prevents that.
In this case, there's two extra pieces of information that the compiler is unaware of but the programmer isn't:
There's an extra indirection — String is heap-allocated, so moving the pointer to that heap allocation isn't really a problem.
The String will never be changed. If it were, then it might reallocate, invalidating the referred-to address. Using a Box<[str]> instead of a String would be a way to enforce this via the type system.
In cases like this, it is OK to use unsafe code, so long as you properly document why it's not unsafe.
use std::collections::HashMap;
#[derive(Debug)]
struct Player {
name: String,
}
fn main() {
let names = ["alice", "bob", "clarice", "danny", "eustice", "frank"];
let mut players = Vec::new();
let mut index = HashMap::new();
for &name in &names {
let player = Player { name: name.into() };
let idx = players.len();
// I copied this code from Stack Overflow without reading the prose
// that describes why this unsafe block is actually safe
let stable_name: &str = unsafe { &*(player.name.as_str() as *const str) };
players.push(player);
index.insert(idx, stable_name);
}
for (k, v) in &index {
println!("{:?} -> {:?}", k, v);
}
for v in &players {
println!("{:?}", v);
}
}
However, my guess is that you don't want this code in your main method but want to return it from some function. That will be a problem, as you will quickly run into Why can't I store a value and a reference to that value in the same struct?.
Honestly, there's styles of code that don't fit well within Rust's limitations. If you run into these, you could:
decide that Rust isn't a good fit for you or your problem.
use unsafe code, preferably thoroughly tested and only exposing a safe API.
investigate alternate representations.
For example, I'd probably rewrite the code to have the index be the primary owner of the key:
use std::collections::BTreeMap;
#[derive(Debug)]
struct Player<'a> {
name: &'a str,
data: &'a PlayerData,
}
#[derive(Debug)]
struct PlayerData {
hit_points: u8,
}
#[derive(Debug)]
struct Players(BTreeMap<String, PlayerData>);
impl Players {
fn new<I>(iter: I) -> Self
where
I: IntoIterator,
I::Item: Into<String>,
{
let players = iter
.into_iter()
.map(|name| (name.into(), PlayerData { hit_points: 100 }))
.collect();
Players(players)
}
fn get<'a>(&'a self, name: &'a str) -> Option<Player<'a>> {
self.0.get(name).map(|data| Player { name, data })
}
}
fn main() {
let names = ["alice", "bob", "clarice", "danny", "eustice", "frank"];
let players = Players::new(names.iter().copied());
for (k, v) in &players.0 {
println!("{:?} -> {:?}", k, v);
}
println!("{:?}", players.get("eustice"));
}
Alternatively, as shown in What's the idiomatic way to make a lookup table which uses field of the item as the key?, you could wrap your type and store it in a set container instead:
use std::collections::BTreeSet;
#[derive(Debug, PartialEq, Eq)]
struct Player {
name: String,
hit_points: u8,
}
#[derive(Debug, Eq)]
struct PlayerByName(Player);
impl PlayerByName {
fn key(&self) -> &str {
&self.0.name
}
}
impl PartialOrd for PlayerByName {
fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {
Some(self.cmp(other))
}
}
impl Ord for PlayerByName {
fn cmp(&self, other: &Self) -> std::cmp::Ordering {
self.key().cmp(&other.key())
}
}
impl PartialEq for PlayerByName {
fn eq(&self, other: &Self) -> bool {
self.key() == other.key()
}
}
impl std::borrow::Borrow<str> for PlayerByName {
fn borrow(&self) -> &str {
self.key()
}
}
#[derive(Debug)]
struct Players(BTreeSet<PlayerByName>);
impl Players {
fn new<I>(iter: I) -> Self
where
I: IntoIterator,
I::Item: Into<String>,
{
let players = iter
.into_iter()
.map(|name| {
PlayerByName(Player {
name: name.into(),
hit_points: 100,
})
})
.collect();
Players(players)
}
fn get(&self, name: &str) -> Option<&Player> {
self.0.get(name).map(|pbn| &pbn.0)
}
}
fn main() {
let names = ["alice", "bob", "clarice", "danny", "eustice", "frank"];
let players = Players::new(names.iter().copied());
for player in &players.0 {
println!("{:?}", player.0);
}
println!("{:?}", players.get("eustice"));
}
not increase the run time by using Rc or RefCell
Guessing about performance characteristics without performing profiling is never a good idea. I honestly don't believe that there'd be a noticeable performance loss from incrementing an integer when a value is cloned or dropped. If the problem required both an index and a vector, then I would reach for some kind of shared ownership.
not increase the run time by using Rc or RefCell.
#Shepmaster already demonstrated accomplishing this using unsafe, once you have I would encourage you to check how much Rc actually would cost you. Here is a full version with Rc:
use std::{
collections::{hash_map::Entry, HashMap},
rc::Rc,
};
#[derive(Debug)]
struct Foo {
v: Rc<str>,
}
#[derive(Debug)]
struct Collection {
vec: Vec<Foo>,
index: HashMap<Rc<str>, usize>,
}
impl Foo {
fn new(s: &str) -> Foo {
Foo {
v: s.into(),
}
}
}
impl Collection {
fn new() -> Collection {
Collection {
vec: Vec::new(),
index: HashMap::new(),
}
}
fn insert(&mut self, foo: Foo) {
match self.index.entry(foo.v.clone()) {
Entry::Occupied(o) => panic!(
"Duplicate entry for: {}, {:?} inserted before {:?}",
foo.v,
o.get(),
foo
),
Entry::Vacant(v) => v.insert(self.vec.len()),
};
self.vec.push(foo)
}
}
fn main() {
let mut collection = Collection::new();
for foo in vec![Foo::new("Hello"), Foo::new("World"), Foo::new("Go!")] {
collection.insert(foo)
}
println!("{:?}", collection);
}
The error is:
error: `s` does not live long enough
--> <anon>:27:5
|
16 | let idx: usize = match hash.entry(&s) { //a
| - borrow occurs here
...
27 | }
| ^ `s` dropped here while still borrowed
|
= note: values in a scope are dropped in the opposite order they are created
The note: at the end is where the answer is.
s must outlive hash because you are using &s as a key in the HashMap. This reference will become invalid when s is dropped. But, as the note says, hash will be dropped after s. A quick fix is to swap the order of their declarations:
let s = "aaa".to_string();
let mut hash = HashMap::<&str, usize>::new();
But now you have another problem:
error[E0505]: cannot move out of `s` because it is borrowed
--> <anon>:22:33
|
17 | let idx: usize = match hash.entry(&s) { //a
| - borrow of `s` occurs here
...
22 | l.push(Foo { v: s }); //b
| ^ move out of `s` occurs here
This one is more obvious. s is borrowed by the Entry, which will live to the end of the block. Cloning s will fix that:
l.push(Foo { v: s.clone() }); //b
I only want to allocate s only once, not cloning it
But the type of Foo.v is String, so it will own its own copy of the str anyway. Just that type means you have to copy the s.
You can replace it with a &str instead which will allow it to stay as a reference into s:
struct Foo<'a> {
v: &'a str,
}
pub fn main() {
// s now lives longer than l
let s = "aaa".to_string();
let mut l = Vec::<Foo>::new();
{
let mut hash = HashMap::<&str, usize>::new();
let idx: usize = match hash.entry(&s) {
Occupied(ent) => {
*ent.get()
}
Vacant(ent) => {
l.push(Foo { v: &s });
ent.insert(l.len() - 1);
l.len() - 1
}
};
}
}
Note that, previously I had to move the declaration of s to before hash, so that it would outlive it. But now, l holds a reference to s, so it has to be declared even earlier, so that it outlives l.

Dereferencing boxed struct and moving its field causes it to be moved

Dereferencing a boxed struct and moving its field causes it to be moved, but doing it in another way works just fine. I don't understand the difference between these two pop functions. How does one fail when the other one doesn't?
pub struct Stack<T> {
head: Option<Box<Node<T>>>,
len: usize,
}
struct Node<T> {
element: T,
next: Option<Box<Node<T>>>,
}
impl<T> Stack<T> {
pub fn pop(&mut self) -> Option<T> {
self.head.take().map(|boxed_node| {
let node = *boxed_node;
self.head = node.next;
node.element
})
}
pub fn pop_causes_error(&mut self) -> Option<T> {
self.head.take().map(|boxed_node| {
self.head = (*boxed_node).next;
(*boxed_node).element
})
}
}
error[E0382]: use of moved value: `boxed_node`
--> src/main.rs:22:13
|
21 | self.head = (*boxed_node).next;
| ------------------ value moved here
22 | (*boxed_node).element
| ^^^^^^^^^^^^^^^^^^^^^ value used here after move
|
= note: move occurs because `boxed_node.next` has type `std::option::Option<std::boxed::Box<Node<T>>>`, which does not implement the `Copy` trait
You can only move out of a box once:
struct S;
fn main() {
let x = Box::new(S);
let val: S = *x;
let val2: S = *x; // <-- use of moved value: `*x`
}
In the first function, you moved the value out of the box and assigned it to the node variable. This allows you to move different fields out of it. Even if one field is moved, other fields are still available. Equivalent to this:
struct S1 {
a: S2,
b: S2,
}
struct S2;
fn main() {
let x = Box::new(S1 { a: S2, b: S2 });
let tmp: S1 = *x;
let a = tmp.a;
let b = tmp.b;
}
In the second function, you move the value to the temporary (*boxed_node) and then move a field out of it. The temporary value is destroyed immediately after the end of the expression, along with its other fields. The box doesn't have the data anymore, and you don't have a variable to take the other field from. Equivalent to this:
struct S1 {
a: S2,
b: S2,
}
struct S2;
fn main() {
let x = Box::new(S1 { a: S2, b: S2 });
let tmp: S1 = *x;
let a = tmp.a;
let tmp: S1 = *x; // <-- use of moved value: `*x`
let b = tmp.b;
}
Some good news is that non-lexical lifetimes will allow your original code to work:
pub struct Stack<T> {
head: Option<Box<Node<T>>>,
len: usize,
}
struct Node<T> {
element: T,
next: Option<Box<Node<T>>>,
}
impl<T> Stack<T> {
pub fn pop_no_longer_causes_error(&mut self) -> Option<T> {
self.head.take().map(|boxed_node| {
self.head = (*boxed_node).next;
(*boxed_node).element
})
}
}
NLL enhances the borrow checker to better track the moves of variables.

Calling two member functions that work with different fields at the same time [duplicate]

It seems that if you borrow a reference to a struct field, the whole struct is considered borrowed. I've managed to isolate and example of what I want to do. I just want to get a "read-only" reference to a field in B to obtain some data and then modify another field of B. Is there a idiomatic Rust way to do this?
struct A {
i: i32,
}
struct B {
j: i32,
a: Box<A>,
}
impl B {
fn get<'a>(&'a mut self) -> &'a A {
&*self.a
}
fn set(&mut self, j: i32) {
self.j = j
}
}
fn foo(a: &A) -> i32 {
a.i + 1
}
fn main() {
let a = Box::new(A { i: 47 });
let mut b = B { a: a, j: 1 };
let a_ref = b.get();
b.set(foo(a_ref));
}
error[E0499]: cannot borrow `b` as mutable more than once at a time
--> src/main.rs:27:5
|
26 | let a_ref = b.get();
| - first mutable borrow occurs here
27 | b.set(foo(a_ref));
| ^ second mutable borrow occurs here
28 | }
| - first borrow ends here
It's a feature of the language. From the compiler point of view, there is no way for it to know that it's safe to call your set() function while a is borrowed via get().
Your get() function borrows b mutably, and returns a reference, thus b will remain borrowed until this reference goes out of scope.
You have several way of handling this:
Separate your two fields into two different structs
Move the code which needs to access both attribute inside a method of B
Make your attributes public, you will thus be able to directly get references to them
Compute the new value before setting it, like this:
fn main() {
let a = Box::new(A { i: 47 });
let mut b = B { a: a, j: 1 };
let newval = {
let a_ref = b.get();
foo(a_ref)
};
b.set(newval);
}
Expanding a bit on Levans' answer
Move the code which needs to access both attribute inside a method of B
That might look like this at a first pass:
impl B {
fn do_thing(&mut self) {
self.j = self.a.foo()
}
}
However, this hard-codes the call to foo. You could also accept a closure to allow this to be more flexible:
impl B {
fn update_j_with_a<F>(&mut self, f: F)
where
F: FnOnce(&mut A) -> i32,
{
self.j = f(&mut self.a)
}
}
// ...
b.update_j_with_a(|a| a.foo())
Separate your two fields into two different structs
This is also applicable when you have borrowed two disjoint subsets of attributes. For example:
struct A {
description: String,
name: String,
age: u8,
money: i32,
}
impl A {
fn update_description(&mut self) {
let description = &mut self.description;
*description = self.build_description()
// cannot borrow `*self` as immutable because `self.description` is also borrowed as mutable
}
fn build_description(&self) -> String {
format!(
"{} is {} years old and has {} money",
self.name,
self.age,
self.money
)
}
}
fn main() {}
Can be changed into
struct A {
description: String,
info: Info,
}
struct Info {
name: String,
age: u8,
money: i32,
}
impl A {
fn update_description(&mut self) {
let description = &mut self.description;
*description = self.info.build_description()
}
}
impl Info {
fn build_description(&self) -> String {
format!(
"{} is {} years old and has {} money",
self.name,
self.age,
self.money
)
}
}
fn main() {}
You can combine these two steps (and I'd say that it's better practice) and move the method onto the inner struct.

How can I use `index_mut` to get a mutable reference?

Even when I implement IndexMut for my struct, I cannot get a mutable reference to an element of structure inner vector.
use std::ops::{Index, IndexMut};
struct Test<T> {
data: Vec<T>,
}
impl<T> Index<usize> for Test<T> {
type Output = T;
fn index<'a>(&'a self, idx: usize) -> &'a T {
return &self.data[idx];
}
}
impl<T> IndexMut<usize> for Test<T> {
fn index_mut<'a>(&'a mut self, idx: usize) -> &'a mut T {
// even here I cannot get mutable reference to self.data[idx]
return self.data.index_mut(idx);
}
}
fn main() {
let mut a: Test<i32> = Test { data: Vec::new() };
a.data.push(1);
a.data.push(2);
a.data.push(3);
let mut b = a[1];
b = 10;
// will print `[1, 2, 3]` instead of [1, 10, 3]
println!("[{}, {}, {}]", a.data[0], a.data[1], a.data[2]);
}
How can I use index_mut to get a mutable reference? Is it possible?
You're almost there. Change this:
let mut b = a[1];
b = 10;
to this:
let b = &mut a[1];
*b = 10;
Indexing syntax returns the value itself, not a reference to it. Your code extracts one i32 from your vector and modifies the variable - naturally, it does not affect the vector itself. In order to obtain a reference through the index, you need to write it explicitly.
This is fairly natural: when you use indexing to access elements of a slice or an array, you get the values of the elements, not references to them, and in order to get a reference you need to write it explicitly.

Resources