I feel like rc::Weak could use a (sort of) AsRef trait implementation. I'm trying to borrow some shared content from a weak pointer, but this won't compile:
use std::rc::Weak;

struct Thing<T>(Weak<T>);

impl<T> Thing<T> {
    fn as_ref(&self) -> Option<&T> {
        self.0.upgrade().map(|rc| {
            rc.as_ref()
        })
    }

    // For clarity, without a confusing closure
    fn unwrapped_as_ref(&self) -> &T {
        self.0.upgrade().unwrap().as_ref()
    }
}
I understand why: the upgraded Rc does not survive the as_ref call. However it seems to me that it is perfectly sound. A possible magic trick using unsafe that does compile:
impl<T> Thing<T> {
    fn unwrapped_as_ref<'a>(&'a self) -> &'a T {
        let rc = self.0.upgrade().unwrap();
        unsafe {
            std::mem::transmute(rc.as_ref())
        }
    }
}
So:
Are there any downsides to this solution? Is it sound? Can you think of a simpler alternative?
Would it make sense to implement an as_ref(&self) -> Option<&T> in the standard library?
You can’t borrow from a weak reference, you just can’t. It’s weak, it does not guarantee that the underlying object exists (that’s why upgrade() returns an Option). And even if you were lucky and the value was still alive at the point you accessed it through the weak reference (upgrade() returned Some), it can be freed the next moment, as soon as the upgraded reference goes out of scope.
In order to get a reference to the underlying value you need something that will keep it alive (e.g. a strong reference), but this means you’ll have to return it along with the reference.
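A minimal sketch of what "returning it along with the reference" can look like, reusing the Thing wrapper from the question. The method names get and with are hypothetical, not part of std: get hands the strong Rc to the caller, and with lends a borrow to a closure only while the upgraded Rc is still alive.

```rust
use std::rc::{Rc, Weak};

pub struct Thing<T>(pub Weak<T>);

impl<T> Thing<T> {
    /// Returns the strong reference itself, so the caller keeps the value
    /// alive for as long as it holds the `Rc`. (`get` is a hypothetical name.)
    pub fn get(&self) -> Option<Rc<T>> {
        self.0.upgrade()
    }

    /// Runs `f` on a borrow of the value. The temporary `Rc` lives until the
    /// closure returns, so the borrow can never dangle.
    pub fn with<R>(&self, f: impl FnOnce(&T) -> R) -> Option<R> {
        self.0.upgrade().map(|rc| f(&*rc))
    }
}
```

Both variants return None once the last strong reference is gone, which is the case the transmute version silently turns into a dangling reference.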
I've got a struct that contains some locked data. The real world is complex, but here's a minimal example (or as minimal as I can make it):
use std::fmt::Display;
use std::ops::{Index, IndexMut};
use std::sync::Mutex;

struct LockedVector<T> {
    stuff: Mutex<Vec<T>>,
}

impl<T> LockedVector<T> {
    pub fn new(v: Vec<T>) -> Self {
        LockedVector {
            stuff: Mutex::new(v),
        }
    }
}

impl<T> Index<usize> for LockedVector<T> {
    type Output = T;

    fn index(&self, index: usize) -> &Self::Output {
        todo!()
    }
}

impl<T> IndexMut<usize> for LockedVector<T> {
    fn index_mut(&mut self, index: usize) -> &mut Self::Output {
        let thing = self.stuff.get_mut().unwrap();
        &mut thing[index]
    }
}

impl<T: Display> Display for LockedVector<T> {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        let strings: Vec<String> = self
            .stuff
            .lock()
            .unwrap()
            .iter()
            .map(|s| format!("{}", s))
            .collect();
        write!(f, "{}", strings.join(", "))
    }
}

fn main() {
    let mut my_stuff = LockedVector::new(vec![0, 1, 2, 3, 4]);
    println!("initially: {}", my_stuff);
    my_stuff[2] = 5;
    println!("then: {}", my_stuff);
    let a_mut_var: &mut usize = &mut my_stuff[3];
    *a_mut_var = 54;
    println!("Still working: {}", my_stuff);
}
What I'm trying to do here is implement the Index and IndexMut traits on a struct, where the data being indexed is behind a Mutex lock. My very fuzzy reasoning for why this should be possible is that the result of locking a mutex is sort-of like a reference, and it seems like you could map a reference onto another reference, or somehow make a sort of reference that wraps the entire lock but only de-references the specific index.
My much less fuzzy reasoning is that the code above compiles and runs (note the todo!) - I'm able to get back mutable references, and I assume I haven't somehow snuck past the mutex in an unthread-safe way. (I made an attempt to test the threaded behavior, but ran into other issues trying to get a mutable reference into another thread at all.)
The weird issue is, I can't do the same for Index - there is no get_immut() I can use, and I haven't found another approach. I can get a mutable reference out of Mutex, but not an immutable one (and of course, I can't get the mutable one if I only have an immutable reference to begin with)
My expectation is that indexing would acquire a lock, and the returned reference (in both mutable and immutable cases) would maintain the lock for their lifetimes. As a bonus, it would be nice if RwLock-ed things could only grab/hold the read lock for the immutable cases, and the write lock for mutable ones.
For context as to why I'd do this: I have a Grid trait that is used by a bunch of different code, but backed by different implementations, some of which are thread-safe. I was hoping to put the Index and IndexMut traits on it for the nice syntax. Threads don't generally have mutable references to the thread-safe Grids at all, so the IndexMut trait would see little use there, but I could see it being valuable during setup or for the non-thread-safe cases. The immutable Index behavior seems like it would be useful everywhere.
Bonus question: I absolutely hate that Display code, how can I make it less hideous?
If you look at the documentation of get_mut, you'll see it is only possible because a mutable reference guarantees that no other reference to the Mutex, and therefore no lock on it, exists. Unfortunately for you, that means a get_ref for Mutex would also have to take a mutable reference, which makes it just an artificially limited get_mut.
Index only gives you a shared reference to self and has to return a plain reference rather than a lock guard, so there is nothing that could hold the lock while the returned reference is alive. You therefore can't safely get a shared reference to the Mutex's contents, and you can't implement Index so that it indexes into something behind a Mutex.
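One common workaround, sketched here under the assumption that giving up the indexing syntax is acceptable: expose the lock guard, or lend the element to a closure while the lock is held. The methods lock and with_index are hypothetical names, not something the answer proposes.

```rust
use std::sync::{Mutex, MutexGuard};

pub struct LockedVector<T> {
    stuff: Mutex<Vec<T>>,
}

impl<T> LockedVector<T> {
    pub fn new(v: Vec<T>) -> Self {
        LockedVector { stuff: Mutex::new(v) }
    }

    /// Returns the guard itself; the lock is held until the guard is dropped.
    pub fn lock(&self) -> MutexGuard<'_, Vec<T>> {
        self.stuff.lock().unwrap()
    }

    /// Locks, lends one element to `f`, and unlocks when `f` returns.
    /// Panics on an out-of-range index, like slice indexing does.
    pub fn with_index<R>(&self, index: usize, f: impl FnOnce(&T) -> R) -> R {
        let guard = self.stuff.lock().unwrap();
        f(&guard[index])
    }
}
```

The guard plays the role that the question hoped the returned reference would play: it is the thing that keeps the lock held for the duration of the access.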
I'm developing for a single-core embedded chip. In C & C++ it's common to statically-define mutable values that can be used globally. The Rust equivalent is roughly this:
static mut MY_VALUE: usize = 0;

pub fn set_value(val: usize) {
    unsafe { MY_VALUE = val }
}

pub fn get_value() -> usize {
    unsafe { MY_VALUE }
}
Now anywhere can call the free functions get_value and set_value.
I think that this should be entirely safe in single-threaded embedded Rust, but I've not been able to find a definitive answer. I'm only interested in types that don't require allocation or destruction (like the primitive in the example here).
The only gotcha I can see is the compiler or processor reordering accesses in unexpected ways (which could be solved using the volatile access methods), but is that unsafe per se?
Edit:
The book suggests that this is safe so long as we can guarantee no multi-threaded data races (obviously the case here)
With mutable data that is globally accessible, it’s difficult to ensure there are no data races, which is why Rust considers mutable static variables to be unsafe.
The docs are phrased less definitively, suggesting that data races are only one way this can be unsafe but not expanding on other examples
accessing mutable statics can cause undefined behavior in a number of ways, for example due to data races in a multithreaded context
The nomicon suggests that this should be safe so long as you don't somehow dereference a bad pointer.
Be aware as there is no such thing as single-threaded code as long as interrupts are enabled. So even for microcontrollers, mutable statics are unsafe.
If you really can guarantee single-threaded access, your assumption is correct that accessing primitive types should be safe. That's why the Cell type exists, which allows mutability of primitive types with the exception that it is not Sync (meaning it explicitly prevents threaded access).
That said, to create a safe static variable, it needs to implement Sync for exactly the reason mentioned above; which Cell doesn't do, for obvious reasons.
To actually have a mutable global variable with a primitive type without using an unsafe block, I personally would use an Atomic. Atomics do not allocate and are available in the core library, meaning they work on microcontrollers.
use core::sync::atomic::{AtomicUsize, Ordering};

static MY_VALUE: AtomicUsize = AtomicUsize::new(0);

pub fn set_value(val: usize) {
    MY_VALUE.store(val, Ordering::Relaxed)
}

pub fn get_value() -> usize {
    MY_VALUE.load(Ordering::Relaxed)
}

fn main() {
    println!("{}", get_value());
    set_value(42);
    println!("{}", get_value());
}
Atomics with Relaxed are zero-overhead on almost all architectures.
In this case it's not unsound, but you still should avoid it because it is too easy to misuse it in a way that is UB.
Instead, use a wrapper around UnsafeCell that is Sync:
use core::cell::UnsafeCell;

pub struct SyncCell<T>(UnsafeCell<T>);

// Only sound if callers never access the cell concurrently
// (which includes interrupt handlers).
unsafe impl<T> Sync for SyncCell<T> {}

impl<T> SyncCell<T> {
    pub const fn new(v: T) -> Self { Self(UnsafeCell::new(v)) }
    pub unsafe fn set(&self, v: T) { *self.0.get() = v; }
}

impl<T: Copy> SyncCell<T> {
    pub unsafe fn get(&self) -> T { *self.0.get() }
}
If you use nightly, you can use SyncUnsafeCell.
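As a rough sketch of how such a wrapper might replace the static mut from the question: the SyncCell definition is repeated so the example stands alone, and the set_value/get_value wrappers assume that nothing (no second thread and no interrupt handler) touches MY_VALUE concurrently. That assumption is what the unsafe blocks are vouching for.

```rust
use core::cell::UnsafeCell;

pub struct SyncCell<T>(UnsafeCell<T>);

// Assumption: the cell is never accessed concurrently.
unsafe impl<T> Sync for SyncCell<T> {}

impl<T> SyncCell<T> {
    pub const fn new(v: T) -> Self {
        Self(UnsafeCell::new(v))
    }

    /// Safety: the caller must guarantee no concurrent access.
    pub unsafe fn set(&self, v: T) {
        unsafe { *self.0.get() = v; }
    }
}

impl<T: Copy> SyncCell<T> {
    /// Safety: the caller must guarantee no concurrent access.
    pub unsafe fn get(&self) -> T {
        unsafe { *self.0.get() }
    }
}

// A plain (non-`mut`) static: no references to the contents are ever handed out.
static MY_VALUE: SyncCell<usize> = SyncCell::new(0);

pub fn set_value(val: usize) {
    // Relies on the single-threaded, no-interrupt assumption stated above.
    unsafe { MY_VALUE.set(val) }
}

pub fn get_value() -> usize {
    unsafe { MY_VALUE.get() }
}
```

Compared with static mut, values are only ever copied in and out, so it is much harder to accidentally create two overlapping references to the same location.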
Mutable statics are unsafe in general because they circumvent the normal borrow checker rules that enforce either exactly 1 mutable borrow exists or any number of immutable borrows exist (including 0), which allows you to write code which causes undefined behavior. For instance, the following compiles and prints 2 2:
static mut COUNTER: i32 = 0;

fn main() {
    unsafe {
        let mut_ref1 = &mut COUNTER;
        let mut_ref2 = &mut COUNTER;
        *mut_ref1 += 1;
        *mut_ref2 += 1;
        println!("{mut_ref1} {mut_ref2}");
    }
}
However we have two mutable references to the same location in memory existing concurrently, which is UB.
I believe the code that you posted there is safe, but I generally would not recommend using static mut. Use an atomic, SyncUnsafeCell/UnsafeCell, a wrapper around a Cell that implements Sync which is safe since your environment is single-threaded, or honestly just about anything else. static mut is wildly unsafe and its use is highly discouraged.
In order to sidestep the issue of exactly how mutable statics can be used safely in single-threaded code, another option is to use thread-local storage:
use std::cell::Cell;

thread_local! (static MY_VALUE: Cell<usize> = {
    Cell::new(0)
});

pub fn set_value(val: usize) {
    MY_VALUE.with(|cell| cell.set(val))
}

pub fn get_value() -> usize {
    MY_VALUE.with(|cell| cell.get())
}
I have a data structure that is somewhat like a RwLock. It is not re-entrant, but the actual locking mechanism is a const function. Is there any way that I can mark this function as "exclusive_borrow" without switching it to be a mutable function? That way multiple calls to 'read' will be caught at compile time instead of panicking.
struct MyRwLock<T> {
    t: T,
}

impl<T> MyRwLock<T> {
    // Works fine, but doesn't enforce at compile time that there is
    // only 1 guard.
    pub fn read(&self) -> ReadGuard<'_, T> { ... }

    // Enforces only 1 ReadGuard at compile time, but unnecessarily
    // requires MyRwLock to be mutable to read.
    pub fn mut_read(&mut self) -> ReadGuard<'_, T> { ... }
}
&mut is a bit of a misnomer. It actually means exclusive reference, not mutable. If that's what you want, then it's correct to use &mut.
In fact, there was a proposal to rename &mut back in 2014. It never came through, but you might occasionally hear whispers of the "mutpocalypse" today.
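To make the "exclusive, not mutable" reading concrete, here is a small sketch. MyRwLock and ReadGuard are hypothetical stand-ins for the types in the question, with the real locking left out. read takes &mut self yet only hands out a shared view of the value; the borrow checker still rejects a second guard while the first is alive.

```rust
// Hypothetical stand-ins for the question's types; no real locking is done.
pub struct MyRwLock<T> {
    t: T,
}

pub struct ReadGuard<'a, T> {
    value: &'a T,
}

impl<T> MyRwLock<T> {
    pub fn new(t: T) -> Self {
        Self { t }
    }

    /// `&mut self` is used purely for exclusivity: nothing is mutated.
    pub fn read(&mut self) -> ReadGuard<'_, T> {
        ReadGuard { value: &self.t }
    }
}

impl<'a, T> ReadGuard<'a, T> {
    pub fn get(&self) -> &T {
        self.value
    }
}

// Holding two guards at once is a compile-time error:
//
//     let a = lock.read();
//     let b = lock.read(); // error[E0499]: cannot borrow `lock` as mutable more than once
//     a.get();
```

The cost is the one the question already names: the caller needs exclusive access to the lock itself, so it cannot be shared between readers.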
I have a struct that is not Send because it contains Rc. Lets say that Arc has too big overhead, so I want to keep using Rc. I would still like to occasionally Send this struct between threads, but only when I can verify that the Rc has strong_count 1 and weak_count 0.
Here is (hopefully safe) abstraction that I have in mind:
mod my_struct {
    use std::rc::Rc;

    #[derive(Debug)]
    pub struct MyStruct {
        reference_counted: Rc<String>,
        // more fields...
    }

    impl MyStruct {
        pub fn new() -> Self {
            MyStruct {
                reference_counted: Rc::new("test".to_string())
            }
        }

        pub fn pack_for_sending(self) -> Result<Sendable, Self> {
            if Rc::strong_count(&self.reference_counted) == 1 &&
                Rc::weak_count(&self.reference_counted) == 0
            {
                Ok(Sendable(self))
            } else {
                Err(self)
            }
        }

        // There are more methods, some may clone the `Rc`!
    }

    /// `Send`able wrapper for `MyStruct` that does not allow you to access it,
    /// only unpack it.
    pub struct Sendable(MyStruct);

    // Safety: `MyStruct` is not `Send` because of `Rc`. `Sendable` can be
    // only created when the `Rc` has strong count 1 and weak count 0.
    unsafe impl Send for Sendable {}

    impl Sendable {
        /// Retrieve the inner `MyStruct`, making it not-sendable again.
        pub fn unpack(self) -> MyStruct {
            self.0
        }
    }
}

use crate::my_struct::MyStruct;

fn main() {
    let handle = std::thread::spawn(|| {
        let my_struct = MyStruct::new();
        dbg!(&my_struct);
        // Do something with `my_struct`, but at the end the inner `Rc` should
        // not be shared with anybody.
        my_struct.pack_for_sending().expect("Some Rc was still shared!")
    });
    let my_struct = handle.join().unwrap().unpack();
    dbg!(&my_struct);
}
I did a demo on the Rust playground.
It works. My question is, is it actually safe?
I know that the Rc is owned by only a single owner and nobody can change that under my hands, because it can't be accessed by other threads and we wrap it into Sendable, which does not allow access to the contained value.
But in some crazy world Rc could for example internally use thread local storage and this would not be safe... So is there some guarantee that I can do this?
I know that I must be extremely careful to not introduce some additional reason for the MyStruct to not be Send.
No.
There are multiple points that need to be verified to be able to send Rc across threads:
There can be no other handle (Rc or Weak) sharing ownership.
The content of Rc must be Send.
The implementation of Rc must use a thread-safe strategy.
Let's review them in order!
Guaranteeing the absence of aliasing
While your algorithm -- checking the counts yourself -- works for now, it would be better to simply ask Rc whether it is aliased or not.
fn is_aliased<T>(t: &mut Rc<T>) -> bool { Rc::get_mut(t).is_none() }
The implementation of get_mut will be adjusted should the implementation of Rc change in ways you have not foreseen.
Sendable content
While your implementation of MyStruct currently puts String (which is Send) into Rc, it could tomorrow change to Rc<str>, and then all bets are off.
Therefore, the sendable check needs to be implemented at the Rc level itself, otherwise you need to audit any change to whatever Rc holds.
fn sendable<T: Send>(mut t: Rc<T>) -> Result<Rc<T>, ...> {
    if !is_aliased(&mut t) {
        Ok(t)
    } else {
        ...
    }
}
Thread-safe Rc internals
And that... cannot be guaranteed.
Since Rc is not Send, its implementation can be optimized in a variety of ways:
The entire memory could be allocated using a thread-local arena.
The counters could be allocated using a thread-local arena, separately, so as to seamlessly convert to/from Box.
...
This is not the case at the moment, AFAIK, however the API allows it, so the next release could definitely take advantage of this.
What should you do?
You could make pack_for_sending unsafe, and dutifully document all assumptions that are counted on -- I suggest using get_mut to remove one of them. Then, on each new release of Rust, you'd have to double-check each assumption to ensure that your usage is still safe.
Or, if you do not mind making an allocation, you could write a conversion to Arc<T> yourself (see Playground):
use std::rc::Rc;
use std::sync::Arc;

fn into_arc<T>(this: Rc<T>) -> Result<Arc<T>, Rc<T>> {
    Rc::try_unwrap(this).map(|t| Arc::new(t))
}
Or, you could write an RFC proposing a Rc <-> Arc conversion!
The API would be:
fn Rc<T: Send>::into_arc(this: Self) -> Result<Arc<T>, Rc<T>>
fn Arc<T>::into_rc(this: Self) -> Result<Rc<T>, Arc<T>>
This could be made very efficiently inside std, and could be of use to others.
Then, you'd convert from MyStruct to MySendableStruct, just moving the fields and converting Rc to Arc as you go, send to another thread, then convert back to MyStruct.
And you would not need any unsafe...
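A rough sketch of that round trip using only today's std, so each conversion pays for a fresh allocation instead of relying on the proposed API. The single-field MyStruct, the public field, and the into_rc helper are simplifications assumed for illustration; MySendableStruct is the name used above.

```rust
use std::rc::Rc;
use std::sync::Arc;

fn into_arc<T>(this: Rc<T>) -> Result<Arc<T>, Rc<T>> {
    // Fails (returning the original `Rc`) if another strong handle exists.
    Rc::try_unwrap(this).map(Arc::new)
}

fn into_rc<T>(this: Arc<T>) -> Result<Rc<T>, Arc<T>> {
    Arc::try_unwrap(this).map(Rc::new)
}

pub struct MyStruct {
    pub reference_counted: Rc<String>,
}

/// `Send` automatically, because `Arc<String>` is `Send`; no `unsafe impl` needed.
pub struct MySendableStruct {
    pub reference_counted: Arc<String>,
}

impl MyStruct {
    pub fn pack_for_sending(self) -> Result<MySendableStruct, MyStruct> {
        match into_arc(self.reference_counted) {
            Ok(arc) => Ok(MySendableStruct { reference_counted: arc }),
            Err(rc) => Err(MyStruct { reference_counted: rc }),
        }
    }
}

impl MySendableStruct {
    pub fn unpack(self) -> Result<MyStruct, MySendableStruct> {
        match into_rc(self.reference_counted) {
            Ok(rc) => Ok(MyStruct { reference_counted: rc }),
            Err(arc) => Err(MySendableStruct { reference_counted: arc }),
        }
    }
}
```

With more fields, the pack and unpack functions would move every other field across unchanged, and the compiler checks that everything in MySendableStruct really is Send.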
The only difference between Arc and Rc is that Arc uses atomic counters. The counters are only accessed when the pointer is cloned or dropped, so the difference between the two is negligible in applications which just share pointers between long-lived threads.
If you have never cloned the Rc, it is safe to send between threads. However, if you can guarantee that the pointer is unique then you can make the same guarantee about a raw value, without using a smart pointer at all!
This all seems quite fragile, for little benefit; future changes to the code might not meet your assumptions, and you will end up with Undefined Behaviour. I suggest that you at least try making some benchmarks with Arc. Only consider approaches like this when you measure a performance problem.
You might also consider using the archery crate, which provides a reference-counted pointer that abstracts over atomicity.
This question already has answers here:
Is there any way to return a reference to a variable created in a function?
(5 answers)
Closed 4 years ago.
I have the following function as part of a Rust WASM application to convert a Boxed closure into the Rust-representation for a JavaScript function.
use js_sys::Function;

type Callback = Rc<RefCell<Option<Closure<FnMut()>>>>;

fn to_function(callback: &Callback) -> &Function {
    callback.borrow().as_ref().unwrap().as_ref().unchecked_ref()
}
However, the compiler complains that the return value uses a borrowed value (obtained with callback.borrow()) so cannot be returned.
Hence, I decided to add lifetime annotations to inform the compiler that this new reference should live as long as the input.
use js_sys::Function;

type Callback = Rc<RefCell<Option<Closure<FnMut()>>>>;

fn to_function<'a>(callback: &'a Callback) -> &'a Function {
    callback.borrow().as_ref().unwrap().as_ref().unchecked_ref()
}
Unfortunately, this hasn't helped and I get the same error. What am I doing wrong here?
Yeah, this isn't going to work.
callback.borrow().as_ref().unwrap().as_ref().unchecked_ref()
Let's break this down in steps:
You're borrowing &RefCell<Option<Closure<FnMut()>>> - so you now have Ref<Option<...>>, which is step #1 of your issues. When this happens, this intermediary value now has a different lifetime than 'a (inferior, to be precise). Anything stemming from this will inherit this lesser lifetime. Call it 'b for now
You then as_ref this Ref, turning it into Option<&'b Closure<FnMut()>>
Rust then converts &'b Closure<FnMut()> into &'b Function
Step 1 is where the snafu happens. Due to the lifetime clash, you're left with this mess. A semi-decent way to solve it is the following construct:
use std::rc::Rc;
use std::cell::{RefCell, Ref};
use std::ops::Deref;

struct CC<'a, T> {
    inner: &'a Rc<RefCell<T>>,
    borrow: Ref<'a, T>
}

impl<'a, T> CC<'a, T> {
    pub fn from_callback(item: &'a Rc<RefCell<T>>) -> CC<'a, T> {
        CC {
            inner: item,
            borrow: item.borrow()
        }
    }

    pub fn to_function(&'a self) -> &'a T {
        self.borrow.deref()
    }
}
It's a bit unwieldy, but it's probably the cleanest way to do so.
A new struct CC is defined, containing a 'a ref to Rc<RefCell<T>> (where the T generic in your case would end up being Option<Closure<FnMut()>>) and a Ref to T with lifetime 'a, auto-populated on the from_callback constructor.
The moment you generate this object, you'll have a Ref with the same lifetime as the ref you gave as an argument, making the entire issue go away. From there, you can call to_function to retrieve a &'a reference to your inner type.
There is a gotcha to this: as long as a single one of these objects exists, you will (obviously) not be able to borrow_mut() on the RefCell, which may or may not kill your use case (as one doesn't use a RefCell for the fun of it). Nevertheless, these objects are relatively cheap to instantiate, so you can afford to bin them once you're done with them.
An example with Function and Closure types replaced with u8 (because js_sys cannot be imported into the sandbox) is available here.
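A different technique from the CC struct, offered only as a sketch: std's Ref::map can project a Ref<Option<T>> down to a Ref<T>, so the function returns the guard instead of a bare reference. As in the linked example, u8 stands in for the js_sys and wasm_bindgen types, and the name to_inner is hypothetical.

```rust
use std::cell::{Ref, RefCell};
use std::rc::Rc;

// `u8` is a stand-in for the closure type used in the question.
pub type Callback = Rc<RefCell<Option<u8>>>;

/// Returns a guard that derefs to the inner value and keeps the `RefCell`
/// borrowed for as long as the guard lives. Panics if the option is `None`,
/// mirroring the `unwrap()` in the original chain.
pub fn to_inner(callback: &Callback) -> Ref<'_, u8> {
    Ref::map(callback.borrow(), |opt| opt.as_ref().unwrap())
}
```

The same gotcha applies: while the returned Ref is alive, borrow_mut() on the RefCell will panic, so the guard should be dropped as soon as the value is no longer needed.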
Although I really like Sébastien's answer and explanation, I ended up going for Ömer's suggestion of using a macro, simply for the sake of conciseness. I'll post the macro in case it's of use to anyone else.
macro_rules! callback_to_function {
    ($callback:expr) => {
        $callback
            .borrow()
            .as_ref()
            .unwrap()
            .as_ref()
            .unchecked_ref()
    };
}
I'll leave Sébastien's answer as the accepted one as I believe it is the more "correct" way to solve this issue and he provides a great explanation.