I have a data structure, that is somewhat like a RwLock. It is not re-entrant, but the actual locking mechanism is a const function. Is there any way that I can mark this function as "exclusive_borrow" without switching it to be be a mutable function. That way multiple calls to 'read' will be caught at compile time instead of panicking.
struct MyRwLock<T> {
t: T,
}
impl MyRwLock {
// Works fine, but doesn't enforce on compile time that there is
// only 1 Guard.
pub fn read(&self) -> ReadGuard<'_, T> { ... }
// Enforces only 1 ReadGuard at compile time, but unnecessarily
// requires MyMutex to be mutable to read.
pub fn mut_read(&mut self) -> ReadGuard<'_, T> { ... }
}
&mut is a bit of a misnomer. It actually means exclusive reference, not mutable. If that's what you want, then it's correct to use &mut.
In fact, there was a proposal to rename &mut back in 2014. It never came through, but you might occasionally hear whispers of the "mutpocalypse" today.
Related
I've got a struct that contains some locked data. The real world is complex, but here's a minimal example (or as minimal as I can make it):
use std::fmt::Display;
use std::ops::{Index, IndexMut};
use std::sync::Mutex;
struct LockedVector<T> {
stuff: Mutex<Vec<T>>,
}
impl<T> LockedVector<T> {
pub fn new(v: Vec<T>) -> Self {
LockedVector {
stuff: Mutex::new(v),
}
}
}
impl<T> Index<usize> for LockedVector<T> {
type Output = T;
fn index(&self, index: usize) -> &Self::Output {
todo!()
}
}
impl<T> IndexMut<usize> for LockedVector<T> {
fn index_mut(&mut self, index: usize) -> &mut Self::Output {
let thing = self.stuff.get_mut().unwrap();
&mut thing[index]
}
}
impl<T: Display> Display for LockedVector<T> {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
let strings: Vec<String> = self
.stuff
.lock()
.unwrap()
.iter()
.map(|s| format!("{}", s))
.collect();
write!(f, "{}", strings.join(", "))
}
}
fn main() {
let mut my_stuff = LockedVector::new(vec![0, 1, 2, 3, 4]);
println!("initially: {}", my_stuff);
my_stuff[2] = 5;
println!("then: {}", my_stuff);
let a_mut_var: &mut usize = &mut my_stuff[3];
*a_mut_var = 54;
println!("Still working: {}", my_stuff);
}
What I'm trying to do here is implement the Index and IndexMut traits on a struct, where the data being indexed is behind a Mutex lock. My very fuzzy reasoning for why this should be possible is that the result of locking a mutex is sort-of like a reference, and it seems like you could map a reference onto another reference, or somehow make a sort of reference that wraps the entire lock but only de-references the specific index.
My much less fuzzy reasoning is that the code above compiles and runs (note the todo!) - I'm able to get back mutable references, and I assume I haven't somehow snuck past the mutex in an unthread-safe way. (I made an attempt to test the threaded behavior, but ran into other issues trying to get a mutable reference into another thread at all.)
The weird issue is, I can't do the same for Index - there is no get_immut() I can use, and I haven't found another approach. I can get a mutable reference out of Mutex, but not an immutable one (and of course, I can't get the mutable one if I only have an immutable reference to begin with)
My expectation is that indexing would acquire a lock, and the returned reference (in both mutable and immutable cases) would maintain the lock for their lifetimes. As a bonus, it would be nice if RwLock-ed things could only grab/hold the read lock for the immutable cases, and the write lock for mutable ones.
For context as to why I'd do this: I have a Grid trait that is used by a bunch of different code, but backed by different implementations, some of which are thread-safe. I was hoping to put the Index and IndexMut traits on it for the nice syntax. Threads don't generally have mutable references to the thread-safe Grids at all, so the IndexMut trait would see little use there, but I could see it being valuable during setup or for the non-thread-safe cases. The immutable Index behavior seems like it would be useful everywhere.
Bonus question: I absolutely hate that Display code, how can I make it less hideous?
If you look at the documentation of get_mut you'll see it's only possible precisely because a mutable reference ensures that there is no other reference or a lock to it, unfortunately for you that means that a get_ref for Mutex would only be possible by taking a mutable reference, that's just an artificially limited get_mut though.
Unfortunately for you since Index only gives you a shared reference you can't safely get a shared reference to it's contents, so you can't implement an Index so that it indexes into something behind a Mutex.
I have a struct and I want to call one of the struct's methods every time a mutable borrow to it has ended. To do so, I would need to know when the mutable borrow to it has been dropped. How can this be done?
Disclaimer: The answer that follows describes a possible solution, but it's not a very good one, as described by this comment from Sebastien Redl:
[T]his is a bad way of trying to maintain invariants. Mostly because dropping the reference can be suppressed with mem::forget. This is fine for RefCell, where if you don't drop the ref, you will simply eventually panic because you didn't release the dynamic borrow, but it is bad if violating the "fraction is in shortest form" invariant leads to weird results or subtle performance issues down the line, and it is catastrophic if you need to maintain the "thread doesn't outlive variables in the current scope" invariant.
Nevertheless, it's possible to use a temporary struct as a "staging area" that updates the referent when it's dropped, and thus maintain the invariant correctly; however, that version basically amounts to making a proper wrapper type and a kind of weird way to use it. The best way to solve this problem is through an opaque wrapper struct that doesn't expose its internals except through methods that definitely maintain the invariant.
Without further ado, the original answer:
Not exactly... but pretty close. We can use RefCell<T> as a model for how this can be done. It's a bit of an abstract question, but I'll use a concrete example to demonstrate. (This won't be a complete example, but something to show the general principles.)
Let's say you want to make a Fraction struct that is always in simplest form (fully reduced, e.g. 3/5 instead of 6/10). You write a struct RawFraction that will contain the bare data. RawFraction instances are not always in simplest form, but they have a method fn reduce(&mut self) that reduces them.
Now you need a smart pointer type that you will always use to mutate the RawFraction, which calls .reduce() on the pointed-to struct when it's dropped. Let's call it RefMut, because that's the naming scheme RefCell uses. You implement Deref<Target = RawFraction>, DerefMut, and Drop on it, something like this:
pub struct RefMut<'a>(&'a mut RawFraction);
impl<'a> Deref for RefMut<'a> {
type Target = RawFraction;
fn deref(&self) -> &RawFraction {
self.0
}
}
impl<'a> DerefMut for RefMut<'a> {
fn deref_mut(&mut self) -> &mut RawFraction {
self.0
}
}
impl<'a> Drop for RefMut<'a> {
fn drop(&mut self) {
self.0.reduce();
}
}
Now, whenever you have a RefMut to a RawFraction and drop it, you know the RawFraction will be in simplest form afterwards. All you need to do at this point is ensure that RefMut is the only way to get &mut access to the RawFraction part of a Fraction.
pub struct Fraction(RawFraction);
impl Fraction {
pub fn new(numerator: i32, denominator: i32) -> Self {
// create a RawFraction, reduce it and wrap it up
}
pub fn borrow_mut(&mut self) -> RefMut {
RefMut(&mut self.0)
}
}
Pay attention to the pub markings (and lack thereof): I'm using those to ensure the soundness of the exposed interface. All three types should be placed in a module by themselves. It would be incorrect to mark the RawFraction field pub inside Fraction, since then it would be possible (for code outside the module) to create an unreduced Fraction without using new or get a &mut RawFraction without going through RefMut.
Supposing all this code is placed in a module named frac, you can use it something like this (assuming Fraction implements Display):
let f = frac::Fraction::new(3, 10);
println!("{}", f); // prints 3/10
f.borrow_mut().numerator += 3;
println!("{}", f); // prints 3/5
The types encode the invariant: Wherever you have Fraction, you can know that it's fully reduced. When you have a RawFraction, &RawFraction, etc., you can't be sure. If you want, you may also make RawFraction's fields non-pub, so that you can't get an unreduced fraction at all except by calling borrow_mut on a Fraction.
Basically the same thing is done in RefCell. There you want to reduce the runtime borrow-count when a borrow ends. Here you want to perform an arbitrary action.
So let's re-use the concept of writing a function that returns a wrapped reference:
struct Data {
content: i32,
}
impl Data {
fn borrow_mut(&mut self) -> DataRef {
println!("borrowing");
DataRef { data: self }
}
fn check_after_borrow(&self) {
if self.content > 50 {
println!("Hey, content should be <= {:?}!", 50);
}
}
}
struct DataRef<'a> {
data: &'a mut Data
}
impl<'a> Drop for DataRef<'a> {
fn drop(&mut self) {
println!("borrow ends");
self.data.check_after_borrow()
}
}
fn main() {
let mut d = Data { content: 42 };
println!("content is {}", d.content);
{
let b = d.borrow_mut();
//let c = &d; // Compiler won't let you have another borrow at the same time
b.data.content = 123;
println!("content set to {}", b.data.content);
} // borrow ends here
println!("content is now {}", d.content);
}
This results in the following output:
content is 42
borrowing
content set to 123
borrow ends
Hey, content should be <= 50!
content is now 123
Be aware that you can still obtain an unchecked mutable borrow with e.g. let c = &mut d;. This will be silently dropped without calling check_after_borrow.
If I have a struct that encapsulates two members, and updates one based on the other, that's fine as long as I do it this way:
struct A {
value: i64
}
impl A {
pub fn new() -> Self {
A { value: 0 }
}
pub fn do_something(&mut self, other: &B) {
self.value += other.value;
}
pub fn value(&self) -> i64 {
self.value
}
}
struct B {
pub value: i64
}
struct State {
a: A,
b: B
}
impl State {
pub fn new() -> Self {
State {
a: A::new(),
b: B { value: 1 }
}
}
pub fn do_stuff(&mut self) -> i64 {
self.a.do_something(&self.b);
self.a.value()
}
pub fn get_b(&self) -> &B {
&self.b
}
}
fn main() {
let mut state = State::new();
println!("{}", state.do_stuff());
}
That is, when I directly refer to self.b. But when I change do_stuff() to this:
pub fn do_stuff(&mut self) -> i64 {
self.a.do_something(self.get_b());
self.a.value()
}
The compiler complains: cannot borrow `*self` as immutable because `self.a` is also borrowed as mutable.
What if I need to do something more complex than just returning a member in order to get the argument for a.do_something()? Must I make a function that returns b by value and store it in a binding, then pass that binding to do_something()? What if b is complex?
More importantly to my understanding, what kind of memory-unsafety is the compiler saving me from here?
A key aspect of mutable references is that they are guaranteed to be the only way to access a particular value while they exist (unless they're reborrowed, which "disables" them temporarily).
When you write
self.a.do_something(&self.b);
the compiler is able to see that the borrow on self.a (which is taken implicitly to perform the method call) is distinct from the borrow on self.b, because it can reason about direct field accesses.
However, when you write
self.a.do_something(self.get_b());
then the compiler doesn't see a borrow on self.b, but rather a borrow on self. That's because lifetime parameters on method signatures cannot propagate such detailed information about borrows. Therefore, the compiler cannot guarantee that the value returned by self.get_b() doesn't give you access to self.a, which would create two references that can access self.a, one of them being mutable, which is illegal.
The reason field borrows don't propagate across functions is to simplify type checking and borrow checking (for machines and for humans). The principle is that the signature should be sufficient for performing those tasks: changing the implementation of a function should not cause errors in its callers.
What if I need to do something more complex than just returning a member in order to get the argument for a.do_something()?
I would move get_b from State to B and call get_b on self.b. This way, the compiler can see the distinct borrows on self.a and self.b and will accept the code.
self.a.do_something(self.b.get_b());
Yes, the compiler isolates functions for the purposes of the safety checks it makes. If it didn't, then every function would essentially have to be inlined everywhere. No one would appreciate this for at least two reasons:
Compile times would go through the roof, and many opportunities for parallelization would have to be discarded.
Changes to a function N calls away could affect the current function. See also Why are explicit lifetimes needed in Rust? which touches on the same concept.
what kind of memory-unsafety is the compiler saving me from here
None, really. In fact, it could be argued that it's creating false positives, as your example shows.
It's really more of a benefit for preserving programmer sanity.
The general advice that I give and follow when I encounter this problem is that the compiler is guiding you to discovering a new type in your existing code.
Your particular example is a bit too simplified for this to make sense, but if you had struct Foo(A, B, C) and found that a method on Foo needed A and B, that's often a good sign that there's a hidden type composed of A and B: struct Foo(Bar, C); struct Bar(A, B).
This isn't a silver bullet as you can end up with methods that need each pair of data, but in my experience it works the majority of the time.
I'm trying to put in one structure information about inotify events and hashmap with inotify watch id as key and name of the file as value.
extern crate inotify;
use inotify::INotify;
use std::sync::{Arc, Mutex};
use std::collections::HashMap;
struct Notificator {
inotify: INotify,
watch_service: Arc<Mutex<HashMap<inotify::wrapper::Watch, Arc<String>>>>,
}
impl Notificator {
pub fn new() -> Notificator {
Notificator {
inotify: INotify::init().unwrap(),
watch_service: Arc::new(Mutex::new(HashMap::new())),
}
}
pub fn resolving_events(&mut self) {
{
let mut events = self.check_for_events();
self.events_execution(events);
}
}
fn check_for_events(&mut self) -> &[inotify::wrapper::Event] {
self.inotify.available_events().unwrap()
}
fn events_execution(&self, events: &[inotify::wrapper::Event]) {
for event in events.iter() {
}
}
}
During compilation I am receiving an error
src/main.rs:204:13: 204:17 error: cannot borrow `*self` as immutable because it is also borrowed as mutable [E0502]
src/main.rs:204 self.events_execution(events);
I thought the best solution would be to separate somehow inotify variable in Notificator structure with watch_service, but I can't dereference self.check_for_events(); because I receive
src/main.rs:203:17: 203:27 error: the trait bound `[inotify::wrapper::Event]: std::marker::Sized` is not satisfied [E0277]
src/main.rs:203 let mut events = *self.check_for_events();
I understand the core of the problem: I'm trying to borrow reference by check_for_events and then using it as parameter in events_execution which also requires self as parameter, but I have no idea how to resolve it.
One thing you could do, although it's not very elegant, is to have your mutable method consume its mutable borrow and return an immutable one that you can then use:
pub fn resolving_events(&mut self) {
let (slf, events) = self.check_for_events();
slf.events_execution(events);
}
fn check_for_events(&mut self) -> (&Self, &[inotify::wrapper::Event]) {
let events = self.inotify.available_events().unwrap();
(&self, events)
}
I've made a small proof-of-concept on the playground (using vecs of u64 as the mutable state, but the principle is similar). It might be cleaner to refactor your code so that some external client can (mutably) borrow the Notifier, produce the events, release the borrow, and borrow it (immutably) to process them...
This is a known issue in the borrow checker (more discussion on internals). You cannot have a function taking a &mut T to an object and returning a &T without losing the ability to access the object again until the &T goes out of scope. You can't work around it due to the way inotify is implemented.
But you can ask the inotify authors to create a get_available_notifications method that doesn't fetch new data. This way you can call available_notifications once and drop the returned value. Then call get_available_notifications (which doesn't take &mut INotify, but just &INotify) and work from there.
I feel like rc::Weak could use a (sort of) AsRef trait implementation. I'm trying to borrow some shared content from a weak pointer, but this won't compile:
use std::rc::Weak;
struct Thing<T>(Weak<T>);
impl<T> Thing<T> {
fn as_ref(&self) -> Option<&T> {
self.0.upgrade().map(|rc| {
rc.as_ref()
})
}
// For clarity, without a confusing closure
fn unwrapped_as_ref(&self) -> &T {
self.0.upgrade().unwrap().as_ref()
}
}
I understand why: the upgraded Rc does not survive the as_ref call. However it seems to me that it is perfectly sound. A possible magic trick using unsafe that does compile:
impl<T> Thing<T> {
fn unwrapped_as_ref<'a>(&'a self) -> &'a T {
let rc = self.0.upgrade().unwrap();
unsafe {
std::mem::transmute(rc.as_ref())
}
}
}
So:
Are there any downsides to this solution? Is it sound? Can you think of a simpler alternative?
Would it make sense to implement a as_ref(&self) -> Option<&T> in the standard library?
You can’t borrow from a weak reference, you just can’t. It’s weak, it does not guarantee that the underlying object exists (that’s why upgrade() returns an Option). And even if you were lucky and the value was still alive at the point you accessed it through the weak reference (upgrade() returned Some), it can be freed the next moment, as soon as the upgraded reference goes out of scope.
In order to get a reference to the underlying value you need something that will keep it alive (e.g. a strong reference), but this means you’ll have to return it along with the reference.