Storing a module as a variable in rust

Storing a module as a variable in rust - rust

I have 2 different modules with precisely the same implementation, same functions, types, etc. They just do different things. I would like to be able to choose one of these modules at runtime and use it exclusively. Furthermore, there are several of these modules that may or may not exist at compile-time based on platform, features, etc. this is a link to a super stripped-down version of what I want. I am trying to choose between the various gfx-hal backends. The best I have been able to come up with is a macro that creates an if statement for each possible module then fires that if statement whenever a function in a module is run. However, this doesn't really seem elegant or at all good. So is there a way to store the modules in a variable and access it, or some way to do this that mimics that?
Thanks in advance

You could do this by turning each of your modules into its own trait implementation, similar to how gfx-rs does things.
Your "trait" would in actuality never be implemented with state, and instead be a collection of associated items like functions, other types, etc.
You could package it up like so:
#![allow(dead_code)]
mod foo {
pub fn print() { println!("hello from foo") }
}
mod bar {
pub fn print() { println!("hello from bar"); }
}
mod zam { // this may not exist depending on the platform, one will always exist
pub fn print() { println!("hello from zam"); }
}
struct FOO;
struct BAR;
struct ZAM;
trait RuntimeModule {
fn print();
}
impl RuntimeModule for FOO {
fn print() { foo::print(); }
}
impl RuntimeModule for BAR {
fn print() { bar::print(); }
}
impl RuntimeModule for ZAM {
fn print() { zam::print(); }
}
fn main() {
// Here we decide which to use
print_module::<FOO>();
}
// This is our "entrypoint"
fn print_module<T: RuntimeModule>() {
T::print();
}
If we decide which to use at runtime (in this case in main), we can then call a generic function which will use the associated types/functions to make decisions.
Note that you would not be able to use Box<dyn RuntimeModule> if RuntimeModule contained associated types that were different for each implementation.

Related

Zero cost builder pattern for recursive data structure using transmute. Is this safe? Is there a better approach?

I would like to create a struct using the builder pattern which must be validated before construction, and I would like to minimize the construction overhead.
I've come up with a nice way to do that using std::mem::transmute, but I'm far from confident that this approach is really safe, or that it's the best approach.
Here's my code: (Rust Playground)
#[derive(Debug)]
pub struct ValidStruct {
items: Vec<ValidStruct>
}
#[derive(Debug)]
pub struct Builder {
pub items: Vec<Builder>
}
#[derive(Debug)]
pub struct InvalidStructError {}
impl Builder {
pub fn new() -> Self {
Self { items: vec![] }
}
pub fn is_valid(&self) -> bool {
self.items.len() % 2 == 1
}
pub fn build(self) -> Result<ValidStruct, InvalidStructError> {
if !self.is_valid() {
return Err(InvalidStructError {});
}
unsafe {
Ok(std::mem::transmute::<Builder, ValidStruct>(self))
}
}
}
fn main() {
let mut builder = Builder::new();
builder.items.push(Builder::new());
let my_struct = builder.build().unwrap();
println!("{:?}", my_struct)
}
So, this seems to work. I think it should be safe because I know the two structs are identical. Am I missing anything? Could this actually cause problems somehow, or is there a cleaner/better approach available?

You can't normally transmute between different structures just because they seem to have the same fields in the same order, because the compiler might change that. You can avoid the risk by forcing the memory layout but you're then fighting the compiler and preventing optimizations. This approach isn't usually recommended and is, in my opinion, not needed here.
What you want is to have
a recursive data structure with public fields so that you can easily build it
an identical structure, built from the first one but with no public access and only built after validation of the first one
And you want to avoid useless copies for performance reasons.
What I suggest is to have a wrapper class. This makes sense because wrapping a struct in another one is totally costless in Rust.
You could thus have
/// This is the "Builder" struct
pub struct Data {
pub items: Vec<Data>,
}
pub struct ValidStruct {
data: Data, // no public access here
}
impl Data {
pub fn build(self) -> Result<ValidStruct, InvalidStructError> {
if !self.is_valid() {
return Err(InvalidStructError {});
}
Ok(Self{ data })
}
}
(alternatively, you could declare a struct Builder as a wrapper of Data too but with a public access to its field)

How can I determine if I have a unique Arc when it's dropped?

I've an Arc<Mutex<Thing>> field in a struct which is cloned many times. It is shared between concurrent threads. Drop::drop is called for each clone as it goes out of scope. Is there any way to determine when Drop::drop is called for the last (unique) Arc<Mutex<Thing>>?
It's clear that strong_count is subject to data races (I've seen them). So, you can't count on Arc::strong_count() == 1 (no pun intended).
I found that I couldn't use Arc::try_unwrap() due to a move issue.
Arc::is_unique() is private.
Other than keeping a Arc<AtomicUsize> field, which is incremented on clone and decremented on drop, is there any way to determine if a drop is for a unique Arc<Mutex<Thing>>?
Here's an MRE:
use std::sync::{Arc};
#[derive(Debug)]
enum Action {
One, Two, Three
}
// Thing trait which operates on an Action, which should be a enum, allowing for
// different action sets.
trait Thing<T> {
fn disconnected(&self);
fn action(&self, action: T);
}
// There are many instances of an ActionController.
// There may be zero or more clones of an instance.
// The final drop of the instances should call thing.disconnected()
// In a multi-core environment, the same instance may be running on multiple cores
// ActionController should not be generic.
#[derive(Clone)]
struct ActionController {
id: usize,
thing: Arc<dyn Thing<Action>>,
}
impl ActionController {
fn new(id: usize, thing: Box<dyn Thing<Action>>) -> Self {
Self { id, thing: Arc::from(thing) }
}
fn invoke(&self, action: Action) {
self.thing.action(action);
}
}
//
// To work around the drop issue, I've implemented Clone for ActionController which
// performs a fetch_add(1) on clone and a fetch_sub(1) on drop. This provides
// suficient information to call disconnected() -- but it just seems like there's
// got to be a better way.
impl Drop for ActionController {
fn drop(&mut self) {
// drop will be called for each clone of an Controller instance. When
// the unique instance is dropped, disconnected() must be called
self.thing.disconnected();
}
}
struct Controlled {}
impl Thing<Action> for Controlled {
fn disconnected(&self) { println!("disconnected")}
fn action(&self, action: Action) {println!("action: {:#?}", action)}
}
fn bad() {
let controlled = Controlled{};
let controlled = Box::new(controlled) as Box<dyn Thing<Action>>;
let controller = ActionController::new(1, controlled);
let clone = controller.clone();
controller.invoke(Action::One);
clone.invoke(Action::Two);
drop (controller);
clone.invoke(Action::Three);
}
fn main() {
bad();
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn incorrect() {
bad();
}
}

Arc::try_unwrap is probably the intended way to do this - is it possible to restructure your code to avoid the move issues you were running into?

Why do you want to know? If you have some extra cleanup code that needs to be executed before the Mutex<Thing> is dropped, maybe you could use an Arc<MyLockedThing> instead, where MyLockedThing is a struct containing a Mutex<Thing> that impls Drop to do the cleanup?

It seems like you want to be notified when the data inside the Arc is to be dropped. If so, this can be done by implementing Drop on the type "inside" the Arc.
Define a newtype:
struct ThingAction(Box<dyn Thing<Action>>);
impl Thing<Action> for ThingAction {
fn disconnected(&self) {
self.0.disconnected()
}
fn action(&self, action: Action) {
self.0.action(action)
}
}
And implement Drop:
impl Drop for ThingAction {
fn drop(&mut self) {
self.disconnected()
}
}
Then use the newtype:
#[derive(Clone)]
struct ActionController {
id: usize,
thing: Arc<ThingAction>,
}
impl ActionController {
fn new(id: usize, thing: Box<dyn Thing<Action>>) -> Self {
Self { id, thing: Arc::new(ThingAction(thing)) }
}

I don't think there's any perfect way to do this without stdlib support (go checkout out Arc::drop).
Weak::strong_count or Weak::upgrade is less subject to races so if you downgrade your Arc then drop it, if the weakref's strong count is 0 or trying to upgrade it fails you know the Arc is dead, but there is no guarantee the current thread killed it, two might have concurrently dropped the Arc at the same time before either had the time to check for the weakref's strong count.
I think the only bulletproof way would be to get notified by a Drop stored inside the Arc, that you're guaranteed is only called once.

How do I get a function pointer from a trait in Rust?

How do I get over something like this:
struct Test {
foo: Option<fn()>
}
impl Test {
fn new(&mut self) {
self.foo = Option::Some(self.a);
}
fn a(&self) { /* can use Test */ }
}
I get this error:
error: attempted to take value of method `a` on type `&mut Test`
--> src/main.rs:7:36
|
7 | self.foo = Option::Some(self.a);
| ^
|
= help: maybe a `()` to call it is missing? If not, try an anonymous function
How do I pass a function pointer from a trait? Similar to what would happen in this case:
impl Test {
fn new(&mut self) {
self.foo = Option::Some(a);
}
}
fn a() { /* can't use Test */ }

What you're trying to do here is get a function pointer from a (to use Python terminology here, since Rust doesn't have a word for this) bound method. You can't.
Firstly, because Rust doesn't have a concept of "bound" methods; that is, you can't refer to a method with the invocant (the thing on the left of the .) already bound in place. If you want to construct a callable which approximates this, you'd use a closure; i.e. || self.a().
However, this still wouldn't work because closures aren't function pointers. There is no "base type" for callable things like in some other languages. Function pointers are a single, specific kind of callable; closures are completely different. Instead, there are traits which (when implemented) make a type callable. They are Fn, FnMut, and FnOnce. Because they are traits, you can't use them as types, and must instead use them from behind some layer of indirection, such as Box<FnOnce()> or &mut FnMut(i32) -> String.
Now, you could change Test to store an Option<Box<Fn()>> instead, but that still wouldn't help. That's because of the other, other problem: you're trying to store a reference to the struct inside of itself. This is not going to work well. If you manage to do this, you effectively render the Test value permanently unusable. More likely is that the compiler just won't let you get that far.
Aside: you can do it, but not without resorting to reference counting and dynamic borrow checking, which is out of scope here.
So the answer to your question as-asked is: you don't.
Let's change the question: instead of trying to crowbar a self-referential closure in, we can instead store a callable that doesn't attempt to capture the invocant at all.
struct Test {
foo: Option<Box<Fn(&Test)>>,
}
impl Test {
fn new() -> Test {
Test {
foo: Option::Some(Box::new(Self::a)),
}
}
fn a(&self) { /* can use Test */ }
fn invoke(&self) {
if let Some(f) = self.foo.as_ref() {
f(self);
}
}
}
fn main() {
let t = Test::new();
t.invoke();
}
The callable being stored is now a function that takes the invocant explicitly, side-stepping the issues with cyclic references. We can use this to store Test::a directly, by referring to it as a free function. Also note that because Test is the implementation type, I can also refer to it as Self.
Aside: I've also corrected your Test::new function. Rust doesn't have constructors, just functions that return values like any other.
If you're confident you will never want to store a closure in foo, you can replace Box<Fn(&Test)> with fn(&Test) instead. This limits you to function pointers, but avoids the extra allocation.
If you haven't already, I strongly urge you to read the Rust Book.

There are few mistakes with your code. new function (by the convention) should not take self reference, since it is expected to create Self type.
But the real issue is, Test::foo expecting a function type fn(), but Test::a's type is fn(&Test) == fn a(&self) if you change the type of foo to fn(&Test) it will work. Also you need to use function name with the trait name instead of self. Instead of assigning to self.a you should assign Test::a.
Here is the working version:
extern crate chrono;
struct Test {
foo: Option<fn(&Test)>
}
impl Test {
fn new() -> Test {
Test {
foo: Some(Test::a)
}
}
fn a(&self) {
println!("a run!");
}
}
fn main() {
let test = Test::new();
test.foo.unwrap()(&test);
}
Also if you gonna assign a field in new() function, and the value must always set, then there is no need to use Option instead it can be like that:
extern crate chrono;
struct Test {
foo: fn(&Test)
}
impl Test {
fn new() -> Test {
Test {
foo: Test::a
}
}
fn a(&self) {
println!("a run!");
}
}
fn main() {
let test = Test::new();
(test.foo)(&test); // Make sure the paranthesis are there
}

Convenient 'Option<Box<Any>>' access when success is assured?

When writing callbacks for generic interfaces, it can be useful for them to define their own local data which they are responsible for creating and accessing.
In C I would just use a void pointer, C-like example:
struct SomeTool {
int type;
void *custom_data;
};
void invoke(SomeTool *tool) {
StructOnlyForThisTool *data = malloc(sizeof(*data));
/* ... fill in the data ... */
tool.custom_data = custom_data;
}
void execute(SomeTool *tool) {
StructOnlyForThisTool *data = tool.custom_data;
if (data.foo_bar) { /* do something */ }
}
When writing something similar in Rust, replacing void * with Option<Box<Any>>, however I'm finding that accessing the data is unreasonably verbose, eg:
struct SomeTool {
type: i32,
custom_data: Option<Box<Any>>,
};
fn invoke(tool: &mut SomeTool) {
let data = StructOnlyForThisTool { /* my custom data */ }
/* ... fill in the data ... */
tool.custom_data = Some(Box::new(custom_data));
}
fn execute(tool: &mut SomeTool) {
let data = tool.custom_data.as_ref().unwrap().downcast_ref::<StructOnlyForThisTool>().unwrap();
if data.foo_bar { /* do something */ }
}
There is one line here which I'd like to be able to write in a more compact way:
tool.custom_data.as_ref().unwrap().downcast_ref::<StructOnlyForThisTool>().unwrap()
tool.custom_data.as_ref().unwrap().downcast_mut::<StructOnlyForThisTool>().unwrap()
While each method makes sense on its own, in practice it's not something I'd want to write throughout a code-base, and not something I'm going to want to type out often or remember easily.
By convention, the uses of unwrap here aren't dangerous because:
While only some tools define custom data, the ones that do always define it.
When the data is set, by convention the tool only ever sets its own data. So there is no chance of having the wrong data.
Any time these conventions aren't followed, its a bug and should panic.
Given these conventions, and assuming accessing custom-data from a tool is something that's done often - what would be a good way to simplify this expression?
Some possible options:
Remove the Option, just use Box<Any> with Box::new(()) representing None so access can be simplified a little.
Use a macro or function to hide verbosity - passing in the Option<Box<Any>>: will work of course, but prefer not - would use as a last resort.
Add a trait to Option<Box<Any>> which exposes a method such as tool.custom_data.unwrap_box::<StructOnlyForThisTool>() with matching unwrap_box_mut.
Update 1): since asking this question a point I didn't include seems relevant.
There may be multiple callback functions like execute which must all be able to access the custom_data. At the time I didn't think this was important to point out.
Update 2): Wrapping this in a function which takes tool isn't practical, since the borrow checker then prevents further access to members of tool until the cast variable goes out of scope, I found the only reliable way to do this was to write a macro.

If the implementation really only has a single method with a name like execute, that is a strong indication to consider using a closure to capture the implementation data. SomeTool can incorporate an arbitrary callable in a type-erased manner using a boxed FnMut, as shown in this answer. execute() then boils down to invoking the closure stored in the struct field implementation closure using (self.impl_)(). For a more general approach, that will also work when you have more methods on the implementation, read on.
An idiomatic and type-safe equivalent of the type+dataptr C pattern is to store the implementation type and pointer to data together as a trait object. The SomeTool struct can contain a single field, a boxed SomeToolImpl trait object, where the trait specifies tool-specific methods such as execute. This has the following characteristics:
You no longer need an explicit type field because the run-time type information is incorporated in the trait object.
Each tool's implementation of the trait methods can access its own data in a type-safe manner without casts or unwraps. This is because the trait object's vtable automatically invokes the correct function for the correct trait implementation, and it is a compile-time error to try to invoke a different one.
The "fat pointer" representation of the trait object has the same performance characteristics as the type+dataptr pair - for example, the size of SomeTool will be two pointers, and accessing the implementation data will still involve a single pointer dereference.
Here is an example implementation:
struct SomeTool {
impl_: Box<SomeToolImpl>,
}
impl SomeTool {
fn execute(&mut self) {
self.impl_.execute();
}
}
trait SomeToolImpl {
fn execute(&mut self);
}
struct SpecificTool1 {
foo_bar: bool
}
impl SpecificTool1 {
pub fn new(foo_bar: bool) -> SomeTool {
let my_data = SpecificTool1 { foo_bar: foo_bar };
SomeTool { impl_: Box::new(my_data) }
}
}
impl SomeToolImpl for SpecificTool1 {
fn execute(&mut self) {
println!("I am {}", self.foo_bar);
}
}
struct SpecificTool2 {
num: u64
}
impl SpecificTool2 {
pub fn new(num: u64) -> SomeTool {
let my_data = SpecificTool2 { num: num };
SomeTool { impl_: Box::new(my_data) }
}
}
impl SomeToolImpl for SpecificTool2 {
fn execute(&mut self) {
println!("I am {}", self.num);
}
}
pub fn main() {
let mut tool1: SomeTool = SpecificTool1::new(true);
let mut tool2: SomeTool = SpecificTool2::new(42);
tool1.execute();
tool2.execute();
}
Note that, in this design, it doesn't make sense to make implementation an Option because we always associate the tool type with the implementation. While it is perfectly valid to have an implementation without data, it must always have a type associated with it.

Recommended way to wrap C lib initialization/destruction routine

I am writing a wrapper/FFI for a C library that requires a global initialization call in the main thread as well as one for destruction.
Here is how I am currently handling it:
struct App;
impl App {
fn init() -> Self {
unsafe { ffi::InitializeMyCLib(); }
App
}
}
impl Drop for App {
fn drop(&mut self) {
unsafe { ffi::DestroyMyCLib(); }
}
}
which can be used like:
fn main() {
let _init_ = App::init();
// ...
}
This works fine, but it feels like a hack, tying these calls to the lifetime of an unnecessary struct. Having the destructor in a finally (Java) or at_exit (Ruby) block seems theoretically more appropriate.
Is there some more graceful way to do this in Rust?
EDIT
Would it be possible/safe to use this setup like so (using the lazy_static crate), instead of my second block above:
lazy_static! {
static ref APP: App = App::new();
}
Would this reference be guaranteed to be initialized before any other code and destroyed on exit? Is it bad practice to use lazy_static in a library?
This would also make it easier to facilitate access to the FFI through this one struct, since I wouldn't have to bother passing around the reference to the instantiated struct (called _init_ in my original example).
This would also make it safer in some ways, since I could make the App struct default constructor private.

I know of no way of enforcing that a method be called in the main thread beyond strongly-worded documentation. So, ignoring that requirement... :-)
Generally, I'd use std::sync::Once, which seems basically designed for this case:
A synchronization primitive which can be used to run a one-time global
initialization. Useful for one-time initialization for FFI or related
functionality. This type can only be constructed with the ONCE_INIT
value.
Note that there's no provision for any cleanup; many times you just have to leak whatever the library has done. Usually if a library has a dedicated cleanup path, it has also been structured to store all that initialized data in a type that is then passed into subsequent functions as some kind of context or environment. This would map nicely to Rust types.
Warning
Your current code is not as protective as you hope it is. Since your App is an empty struct, an end-user can construct it without calling your method:
let _init_ = App;
We will use a zero-sized argument to prevent this. See also What's the Rust idiom to define a field pointing to a C opaque pointer? for the proper way to construct opaque types for FFI.
Altogether, I'd use something like this:
use std::sync::Once;
mod ffi {
extern "C" {
pub fn InitializeMyCLib();
pub fn CoolMethod(arg: u8);
}
}
static C_LIB_INITIALIZED: Once = Once::new();
#[derive(Copy, Clone)]
struct TheLibrary(());
impl TheLibrary {
fn new() -> Self {
C_LIB_INITIALIZED.call_once(|| unsafe {
ffi::InitializeMyCLib();
});
TheLibrary(())
}
fn cool_method(&self, arg: u8) {
unsafe { ffi::CoolMethod(arg) }
}
}
fn main() {
let lib = TheLibrary::new();
lib.cool_method(42);
}

I did some digging around to see how other FFI libs handle this situation. Here is what I am currently using (similar to #Shepmaster's answer and based loosely on the initialization routine of curl-rust):
fn initialize() {
static INIT: Once = ONCE_INIT;
INIT.call_once(|| unsafe {
ffi::InitializeMyCLib();
assert_eq!(libc::atexit(cleanup), 0);
});
extern fn cleanup() {
unsafe { ffi::DestroyMyCLib(); }
}
}
I then call this function inside the public constructors for my public structs.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Storing a module as a variable in rust - rust

Related

Zero cost builder pattern for recursive data structure using transmute. Is this safe? Is there a better approach?

How can I determine if I have a unique Arc when it's dropped?

How do I get a function pointer from a trait in Rust?

Convenient 'Option<Box<Any>>' access when success is assured?

Recommended way to wrap C lib initialization/destruction routine

Categories

Resources