I'm building an interpreter with a garbage collector. I want a thread-local nursery region, and a shared older region. I am having trouble setting up the nursery. I have:
const NurserySize : usize = 25000;
#[thread_local]
static mut NurseryMemory : [usize;NurserySize] = [0;NurserySize];
thread_local! {
static Nursery: AllocableRegion = AllocableRegion::makeLocal(unsafe{&mut NurseryMemory});
}
#[cfg(test)]
mod testMemory {
use super::*;
#[test]
fn test1() {
Nursery.with(|n| n.allocObject(10));
}
}
First question is why do I need the unsafe - NurseryMemory is thread local, so access can't be unsafe?
Second question is how can I actually use this? The code is at playground, but it doesn't compile and attempts I've made to fix it seem to make it worse.
1. Why is unsafe required to get a reference to a mutable ThreadLocal?
The same reason that you need unsafe for a normal mutable static,
you would be able to create aliasing mut pointers in safe code.
The following incorrectly creates two mutable references to the mutable thread local.
#![feature(thread_local)]
#[thread_local]
static mut SomeValue: Result<&str, usize> = Ok("hello world");
pub fn main() {
let first = unsafe {&mut SomeValue};
let second = unsafe {&mut SomeValue};
if let Ok(string) = first {
*second = Err(0); // should invalidate the string reference, but it doesn't
println!("{}", string) // as first and second are considered to be disjunct
}
}
first wouldn't even need to be a mutable reference for this to be a problem.
2. How to fix the code?
You could use a RefCell around the AllocatableRegion to dynamically enforce the borrowing of the inner value.
const NurserySize : usize = 25000;
#[thread_local]
static mut NurseryMemory : [usize;NurserySize] = [0;NurserySize];
thread_local! {
static Nursery: RefCell<AllocableRegion> = RefCell::new(AllocableRegion::makeLocal(unsafe{&mut NurseryMemory}));
}
#[cfg(test)]
mod testMemory {
use super::*;
#[test]
fn test1() {
Nursery.with(|n| n.borrow_mut().allocObject(10));
}
}
You don't need unsafe to make a mutable thread-local struct in rust. However, Nursery does need to be a RefCell. This is sufficient:
use std::cell::RefCell;
const NURSERY_SIZE: usize = 300;
thread_local! {
static NURSERY: RefCell<[usize; NURSERY_SIZE]> = RefCell::new([0; NURSERY_SIZE]);
}
#[cfg(test)]
mod test_memory {
use super::*;
#[test]
fn test1() {
NURSERY.with(|n| n.borrow_mut()[10] = 20);
}
}
Rust Playground link
Related
I want to write a FFI wrapper for sn_api library, which contains async functions. It will be used in single-threaded non-async code written in Red.
I found, that the easy way is to use Runtime::new().unwrap().block_on(...) in every exported function, although it involves a lot of creating new Tokio runtimes and seem to be too heavy to be run on every call:
use std::os::raw::c_char;
use std::ffi::{CString, CStr};
use sn_api::{BootstrapConfig, Safe};
use tokio::runtime::Runtime;
#[no_mangle]
pub extern "C" _safe_connect(ptr: *const Safe, bootstrap_contact: *const c_char) {
assert!(!ptr.is_null());
let _safe = unsafe {
&*ptr
};
let bootstrap_contact = unsafe {
CStr::from_ptr(bootstrap_contact)
}
let mut bootstrap_contacts = BootstrapConfig::default();
bootstrap_contacts.insert(bootstrap_contact.parse().expect("Invalid bootstrap address"));
// how to reuse the Runtime in other functions?
Runtime::new().unwrap().block_on(_safe.connect(None, None, Some(bootstrap_contacts)));
}
Is it possible to run all async functions on a common Runtime? I imagine it would require creating some singleton / global, but my library is compiled with crate-type = ["cdylib"], which seems not a good place for globals. What would be the best approach?
I've decided for an approach, where I create a Tokio Runtime, and then pass it to every FFI function call containing async code:
use std::os::raw::c_char;
use std::ffi::{CString, CStr};
use sn_api::{BootstrapConfig, Safe};
use tokio::runtime::Runtime;
#[no_mangle]
pub extern "C" fn init_runtime() -> *mut Runtime {
Box::into_raw(Box::new(Runtime::new().unwrap()))
}
#[no_mangle]
pub extern "C" _safe_connect(rt_ptr: *mut Runtime, safe_ptr: *mut Safe, bootstrap_contact: *const c_char) {
assert!(!safe_ptr.is_null());
assert!(!rt_ptr.is_null());
let bootstrap_contact = unsafe {
CStr::from_ptr(bootstrap_contact)
}
let mut bootstrap_contacts = BootstrapConfig::default();
bootstrap_contacts.insert(bootstrap_contact.parse().expect("Invalid bootstrap address"));
unsafe {
let _safe = &mut *safe_ptr;
let rt = &mut *rt_ptr;
rt.block_on(_safe.connect(None, None, Some(bootstrap_contacts))).unwrap();
}
}
I faced the same issue. Here is my cut: export-tokio-to-lib.
plugin.rs:
use async_ffi::{FfiFuture, FutureExt};
use tokio::runtime::Handle;
/// # Safety
#[no_mangle]
pub unsafe extern "C" fn test(arg: f32, handle: *const Handle) -> FfiFuture<safer_ffi::String> {
let handle = &*handle;
async move {
let _enter = handle.enter();
tokio::time::sleep(std::time::Duration::from_secs_f32(arg)).await;
format!("slept {arg} secs").into()
}
.into_ffi()
}
Try this.
From this:
#[tokio::main]
async fn main() {
println!("hello");
}
Transformed into:
fn main() {
let mut rt = tokio::runtime::Runtime::new().unwrap();
rt.block_on(async {
println!("hello");
})
}
Reference: https://tokio.rs/tokio/tutorial/hello-tokio#async-main-function
This is something of a controversial topic, so let me start by explaining my use case, and then talk about the actual problem.
I find that for a bunch of unsafe things, it's important to make sure that you don't leak memory; this is actually quite easy to do if you start using transmute() and forget(). For example, passing a boxed instance to C code for an arbitrary amount of time, then fetching it back out and 'resurrecting it' by using transmute.
Imagine I have a safe wrapper for this sort of API:
trait Foo {}
struct CBox;
impl CBox {
/// Stores value in a bound C api, forget(value)
fn set<T: Foo>(value: T) {
// ...
}
/// Periodically call this and maybe get a callback invoked
fn poll(_: Box<Fn<(EventType, Foo), ()> + Send>) {
// ...
}
}
impl Drop for CBox {
fn drop(&mut self) {
// Safely load all saved Foo's here and discard them, preventing memory leaks
}
}
To test this is actually not leaking any memory, I want some tests like this:
#[cfg(test)]
mod test {
struct IsFoo;
impl Foo for IsFoo {}
impl Drop for IsFoo {
fn drop(&mut self) {
Static::touch();
}
}
#[test]
fn test_drops_actually_work() {
guard = Static::lock(); // Prevent any other use of Static concurrently
Static::reset(); // Set to zero
{
let c = CBox;
c.set(IsFoo);
c.set(IsFoo);
c.poll(/*...*/);
}
assert!(Static::get() == 2); // Assert that all expected drops were invoked
guard.release();
}
}
How can you create this type of static singleton object?
It must use a Semaphore style guard lock to ensure that multiple tests do not concurrently run, and then unsafely access some kind of static mutable value.
I thought perhaps this implementation would work, but practically speaking it fails because occasionally race conditions result in a duplicate execution of init:
/// Global instance
static mut INSTANCE_LOCK: bool = false;
static mut INSTANCE: *mut StaticUtils = 0 as *mut StaticUtils;
static mut WRITE_LOCK: *mut Semaphore = 0 as *mut Semaphore;
static mut LOCK: *mut Semaphore = 0 as *mut Semaphore;
/// Generate instances if they don't exist
unsafe fn init() {
if !INSTANCE_LOCK {
INSTANCE_LOCK = true;
INSTANCE = transmute(box StaticUtils::new());
WRITE_LOCK = transmute(box Semaphore::new(1));
LOCK = transmute(box Semaphore::new(1));
}
}
Note specifically that unlike a normal program where you can be certain that your entry point (main) is always running in a single task, the test runner in Rust does not offer any kind of single entry point like this.
Other, obviously, than specifying the maximum number of tasks; given dozens of tests, only a handful need to do this sort of thing, and it's slow and pointless to limit the test task pool to one just for this one case.
It looks like a use case for std::sync::Once:
use std::sync::{Once, ONCE_INIT};
static INIT: Once = ONCE_INIT;
Then in your tests call
INIT.doit(|| unsafe { init(); });
Once guarantees that your init will only be executed once, no matter how many times you call INIT.doit().
See also lazy_static, which makes things a little more ergonomic. It does essentially the same thing as a static Once for each variable, but wraps it in a type that implements Deref so that you can access it like a normal reference.
Usage looks like this (from the documentation):
#[macro_use]
extern crate lazy_static;
use std::collections::HashMap;
lazy_static! {
static ref HASHMAP: HashMap<u32, &'static str> = {
let mut m = HashMap::new();
m.insert(0, "foo");
m.insert(1, "bar");
m.insert(2, "baz");
m
};
static ref COUNT: usize = HASHMAP.len();
static ref NUMBER: u32 = times_two(21);
}
fn times_two(n: u32) -> u32 { n * 2 }
fn main() {
println!("The map has {} entries.", *COUNT);
println!("The entry for `0` is \"{}\".", HASHMAP.get(&0).unwrap());
println!("A expensive calculation on a static results in: {}.", *NUMBER);
}
Note that autoderef means that you don't even have to use * whenever you call a method on your static variable. The variable will be initialized the first time it's Deref'd.
However, lazy_static variables are immutable (since they're behind a reference). If you want a mutable static, you'll need to use a Mutex:
lazy_static! {
static ref VALUE: Mutex<u64>;
}
impl Drop for IsFoo {
fn drop(&mut self) {
let mut value = VALUE.lock().unwrap();
*value += 1;
}
}
#[test]
fn test_drops_actually_work() {
// Have to drop the mutex guard to unlock, so we put it in its own scope
{
*VALUE.lock().unwrap() = 0;
}
{
let c = CBox;
c.set(IsFoo);
c.set(IsFoo);
c.poll(/*...*/);
}
assert!(*VALUE.lock().unwrap() == 2); // Assert that all expected drops were invoked
}
If you're willing to use nightly Rust you can use SyncLazy instead of the external lazy_static crate:
#![feature(once_cell)]
use std::collections::HashMap;
use std::lazy::SyncLazy;
static HASHMAP: SyncLazy<HashMap<i32, String>> = SyncLazy::new(|| {
println!("initializing");
let mut m = HashMap::new();
m.insert(13, "Spica".to_string());
m.insert(74, "Hoyten".to_string());
m
});
fn main() {
println!("ready");
std::thread::spawn(|| {
println!("{:?}", HASHMAP.get(&13));
}).join().unwrap();
println!("{:?}", HASHMAP.get(&74));
// Prints:
// ready
// initializing
// Some("Spica")
// Some("Hoyten")
}
I want to have a structure on the heap with two references; one for me and another from a closure. Note that the code is for the single-threaded case:
use std::rc::Rc;
#[derive(Debug)]
struct Foo {
val: u32,
}
impl Foo {
fn set_val(&mut self, val: u32) {
self.val = val;
}
}
impl Drop for Foo {
fn drop(&mut self) {
println!("we drop {:?}", self);
}
}
fn need_callback(mut cb: Box<FnMut(u32)>) {
cb(17);
}
fn create() -> Rc<Foo> {
let rc = Rc::new(Foo { val: 5 });
let weak_rc = Rc::downgrade(&rc);
need_callback(Box::new(move |x| {
if let Some(mut rc) = weak_rc.upgrade() {
if let Some(foo) = Rc::get_mut(&mut rc) {
foo.set_val(x);
}
}
}));
rc
}
fn main() {
create();
}
In the real code, need_callback saves the callback to some place, but before that may call cb as need_callback does.
The code shows that std::rc::Rc is not suitable for this task because foo.set_val(x) is never called; I have two strong references and Rc::get_mut gives None in this case.
What smart pointer with reference counting should I use instead of std::rc::Rc to make it possible to call foo.set_val? Maybe it is possible to fix my code and still use std::rc::Rc?
After some thinking, I need something like std::rc::Rc, but weak references should prevent dropping. I can have two weak references and upgrade them to strong when I need mutability.
Because it is a singled-threaded program, I will have only strong reference at a time, so everything will work as expected.
Rc (and its multithreaded counterpart Arc) only concern themselves with ownership. Instead of a single owner, there is now joint ownership, tracked at runtime.
Mutability is a different concept, although closely related to ownership: if you own a value, then you have the ability to mutate it. This is why Rc::get_mut only works when there is a single strong reference - it's the same as saying there is a single owner.
If you need the ability to divide mutability in a way that doesn't match the structure of the program, you can use tools like Cell or RefCell for single-threaded programs:
use std::cell::RefCell;
fn create() -> Rc<RefCell<Foo>> {
let rc = Rc::new(RefCell::new(Foo { val: 5 }));
let weak_rc = Rc::downgrade(&rc);
need_callback(move |x| {
if let Some(rc) = weak_rc.upgrade() {
rc.borrow_mut().set_val(x);
}
});
rc
}
Or Mutex, RwLock, or an atomic type in multithreaded contexts:
use std::sync::Mutex;
fn create() -> Rc<Mutex<Foo>> {
let rc = Rc::new(Mutex::new(Foo { val: 5 }));
let weak_rc = Rc::downgrade(&rc);
need_callback(move |x| {
if let Some(rc) = weak_rc.upgrade() {
if let Ok(mut foo) = rc.try_lock() {
foo.set_val(x);
}
}
});
rc
}
These tools all defer the check that there is only a single mutable reference to runtime, instead of compile time.
I'm writing a system where I have a collection of Objects, and each Object has a unique integral ID. Here's how I would do it in C++:
class Object {
public:
Object(): id_(nextId_++) { }
private:
int id_;
static int nextId_;
}
int Object::nextId_ = 1;
This is obviously not thread_safe, but if I wanted it to be, I could make nextId_ an std::atomic_int, or wrap a mutex around the nextId_++ expression.
How would I do this in (preferably safe) Rust? There's no static struct members, nor are global mutable variables safe. I could always pass nextId into the new function, but these objects are going to be allocated in a number of places, and I would prefer not to pipe the nextId number hither and yon. Thoughts?
Atomic variables can live in statics, so you can use it relatively straightforwardly (the downside is that you have global state).
Example code: (playground link)
use std::{
sync::atomic::{AtomicUsize, Ordering},
thread,
};
static OBJECT_COUNTER: AtomicUsize = AtomicUsize::new(0);
#[derive(Debug)]
struct Object(usize);
impl Object {
fn new() -> Self {
Object(OBJECT_COUNTER.fetch_add(1, Ordering::SeqCst))
}
}
fn main() {
let threads = (0..10)
.map(|_| thread::spawn(|| Object::new()))
.collect::<Vec<_>>();
for t in threads {
println!("{:?}", t.join().unwrap());
}
}
nor are global mutable variables safe
Your C++ example seems like it would have thread-safety issues, but I don't know enough C++ to be sure.
However, only unsynchronized global mutable variables are trouble. If you don't care about cross-thread issues, you can use a thread-local:
use std::cell::Cell;
#[derive(Debug)]
struct Monster {
id: usize,
health: u8,
}
thread_local!(static MONSTER_ID: Cell<usize> = Cell::new(0));
impl Monster {
fn new(health: u8) -> Monster {
MONSTER_ID.with(|thread_id| {
let id = thread_id.get();
thread_id.set(id + 1);
Monster { id, health }
})
}
}
fn main() {
let gnome = Monster::new(41);
let troll = Monster::new(42);
println!("gnome {:?}", gnome);
println!("troll {:?}", troll);
}
If you do want something that works better with multiple threads, check out bluss' answer, which shows how to use an atomic variable.
This is something of a controversial topic, so let me start by explaining my use case, and then talk about the actual problem.
I find that for a bunch of unsafe things, it's important to make sure that you don't leak memory; this is actually quite easy to do if you start using transmute() and forget(). For example, passing a boxed instance to C code for an arbitrary amount of time, then fetching it back out and 'resurrecting it' by using transmute.
Imagine I have a safe wrapper for this sort of API:
trait Foo {}
struct CBox;
impl CBox {
/// Stores value in a bound C api, forget(value)
fn set<T: Foo>(value: T) {
// ...
}
/// Periodically call this and maybe get a callback invoked
fn poll(_: Box<Fn<(EventType, Foo), ()> + Send>) {
// ...
}
}
impl Drop for CBox {
fn drop(&mut self) {
// Safely load all saved Foo's here and discard them, preventing memory leaks
}
}
To test this is actually not leaking any memory, I want some tests like this:
#[cfg(test)]
mod test {
struct IsFoo;
impl Foo for IsFoo {}
impl Drop for IsFoo {
fn drop(&mut self) {
Static::touch();
}
}
#[test]
fn test_drops_actually_work() {
guard = Static::lock(); // Prevent any other use of Static concurrently
Static::reset(); // Set to zero
{
let c = CBox;
c.set(IsFoo);
c.set(IsFoo);
c.poll(/*...*/);
}
assert!(Static::get() == 2); // Assert that all expected drops were invoked
guard.release();
}
}
How can you create this type of static singleton object?
It must use a Semaphore style guard lock to ensure that multiple tests do not concurrently run, and then unsafely access some kind of static mutable value.
I thought perhaps this implementation would work, but practically speaking it fails because occasionally race conditions result in a duplicate execution of init:
/// Global instance
static mut INSTANCE_LOCK: bool = false;
static mut INSTANCE: *mut StaticUtils = 0 as *mut StaticUtils;
static mut WRITE_LOCK: *mut Semaphore = 0 as *mut Semaphore;
static mut LOCK: *mut Semaphore = 0 as *mut Semaphore;
/// Generate instances if they don't exist
unsafe fn init() {
if !INSTANCE_LOCK {
INSTANCE_LOCK = true;
INSTANCE = transmute(box StaticUtils::new());
WRITE_LOCK = transmute(box Semaphore::new(1));
LOCK = transmute(box Semaphore::new(1));
}
}
Note specifically that unlike a normal program where you can be certain that your entry point (main) is always running in a single task, the test runner in Rust does not offer any kind of single entry point like this.
Other, obviously, than specifying the maximum number of tasks; given dozens of tests, only a handful need to do this sort of thing, and it's slow and pointless to limit the test task pool to one just for this one case.
It looks like a use case for std::sync::Once:
use std::sync::{Once, ONCE_INIT};
static INIT: Once = ONCE_INIT;
Then in your tests call
INIT.doit(|| unsafe { init(); });
Once guarantees that your init will only be executed once, no matter how many times you call INIT.doit().
See also lazy_static, which makes things a little more ergonomic. It does essentially the same thing as a static Once for each variable, but wraps it in a type that implements Deref so that you can access it like a normal reference.
Usage looks like this (from the documentation):
#[macro_use]
extern crate lazy_static;
use std::collections::HashMap;
lazy_static! {
static ref HASHMAP: HashMap<u32, &'static str> = {
let mut m = HashMap::new();
m.insert(0, "foo");
m.insert(1, "bar");
m.insert(2, "baz");
m
};
static ref COUNT: usize = HASHMAP.len();
static ref NUMBER: u32 = times_two(21);
}
fn times_two(n: u32) -> u32 { n * 2 }
fn main() {
println!("The map has {} entries.", *COUNT);
println!("The entry for `0` is \"{}\".", HASHMAP.get(&0).unwrap());
println!("A expensive calculation on a static results in: {}.", *NUMBER);
}
Note that autoderef means that you don't even have to use * whenever you call a method on your static variable. The variable will be initialized the first time it's Deref'd.
However, lazy_static variables are immutable (since they're behind a reference). If you want a mutable static, you'll need to use a Mutex:
lazy_static! {
static ref VALUE: Mutex<u64>;
}
impl Drop for IsFoo {
fn drop(&mut self) {
let mut value = VALUE.lock().unwrap();
*value += 1;
}
}
#[test]
fn test_drops_actually_work() {
// Have to drop the mutex guard to unlock, so we put it in its own scope
{
*VALUE.lock().unwrap() = 0;
}
{
let c = CBox;
c.set(IsFoo);
c.set(IsFoo);
c.poll(/*...*/);
}
assert!(*VALUE.lock().unwrap() == 2); // Assert that all expected drops were invoked
}
If you're willing to use nightly Rust you can use SyncLazy instead of the external lazy_static crate:
#![feature(once_cell)]
use std::collections::HashMap;
use std::lazy::SyncLazy;
static HASHMAP: SyncLazy<HashMap<i32, String>> = SyncLazy::new(|| {
println!("initializing");
let mut m = HashMap::new();
m.insert(13, "Spica".to_string());
m.insert(74, "Hoyten".to_string());
m
});
fn main() {
println!("ready");
std::thread::spawn(|| {
println!("{:?}", HASHMAP.get(&13));
}).join().unwrap();
println!("{:?}", HASHMAP.get(&74));
// Prints:
// ready
// initializing
// Some("Spica")
// Some("Hoyten")
}