How to `use` function scoped structs? - rust

Consider the following contrived example:
mod Parent {
    fn my_fn() {
        struct MyStruct;
        mod Inner {
            use super::MyStruct; // Error: unresolved import `super::MyStruct`. No `MyStruct` in `Parent`
        }
    }
}
How can I import MyStruct here from the inner module?
Motivation
While the above is code that you'll never write manually, it is code that is useful to generate. A real-world use-case would be a derive-macro. Let's say I want this:
#[derive(Derivative)]
struct MyStruct;
Now it's useful to isolate the generated code in its own module, to isolate the generated code from the source code (e.g. to avoid naming collisions, leaking of use declarations, etc.). So I want the generated code to be something like this:
mod _Derivative_MyStruct {
use super::MyStruct;
impl Derivative for MyStruct { }
}
However, the example above fails if the struct is defined in a function, due to the problem at the top. e.g. this won't work:
fn my_fn() {
#[derive(Derivative)]
struct MyStruct;
}
as it expands into:
fn my_fn() {
#[derive(Derivative)]
struct MyStruct;
mod _Derivative_MyStruct {
use super::MyStruct; // error
impl Derivative for MyStruct {}
}
}
This is especially troublesome for doctests, as these are implicitly wrapped in a function. E.g. this will give the unresolved import problem:
/// Some interesting documentation
/// ```
/// #[derive(Derivative)]
/// struct MyStruct;
/// ```
Without the ability to refer to the outer scope, I either need to give up isolation, or require wrapping in modules at the call site. I'd like to avoid this.

This is issue #79260. I don't think there is a solution.
However, you can define the nested items inside an unnamed const (const _: () = { /* code */ };) instead of a module. This prevents name collisions and is the idiomatic thing to do in macros that need to define names. Note, however, that items defined inside the const cannot be named from outside it.
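As a rough illustration of the unnamed-const approach, here is a hand-written sketch of what a derive macro might emit. The `Derivative` trait body, the `describe` method, and the `helper` function are assumptions made up for this example; they are not part of any real crate.

```rust
// Hypothetical trait standing in for the one the derive macro implements.
trait Derivative {
    fn describe() -> &'static str;
}

fn my_fn() -> &'static str {
    struct MyStruct;

    // Roughly what the macro could emit instead of `mod _Derivative_MyStruct`.
    // The block has its own item scope, yet it can still see `MyStruct`.
    const _: () = {
        // Helper that stays private to this block, so it cannot collide
        // with names in the surrounding function.
        fn helper() -> &'static str {
            "derived inside a function"
        }

        impl Derivative for MyStruct {
            fn describe() -> &'static str {
                helper()
            }
        }
    };

    // The impl is usable here, even though `helper` is not nameable.
    <MyStruct as Derivative>::describe()
}
```

Calling `my_fn()` returns "derived inside a function". No `use super::...` is needed, because the const block resolves names through the enclosing function scope.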

Related

Can I import super:: from a mod that's inside a fn?

I realise that this is an extremely odd thing to do but I'm writing a macro I'd like to be usable in as many places as possible. Take the following code:
mod outer {
    struct OuterTestStruct {}
    fn do_thing() {
        struct InnerTestStruct {}
        mod inner {
            use super::OuterTestStruct;
            use super::InnerTestStruct;
        }
    }
}
This doesn't compile because of the use super::InnerTestStruct line. But the use super::OuterTestStruct line works fine, so my assumption here is that super skips over the fn context and goes straight to the parent mod.
Is there any way I can get a reference to InnerTestStruct from inside mod inner? Especially without knowing any context beforehand (i.e. imagine a macro invocation inside fn do_thing(), it isn't going to know it's inside a fn)
Is there any way I can get a reference to InnerTestStruct from inside mod inner?
No, super will refer to the encompassing module, not the function scope. There is no path that can name InnerTestStruct as far as I'm aware.
Since you mention macros specifically, the Rust API Guidelines warn against this exact case:
Item macros work anywhere that items are allowed
Rust allows items to be placed at the module level or within a tighter scope like a function. Item macros should work equally well as ordinary items in all of these places. The test suite should include invocations of the macro in at least the module scope and function scope.
As a simple example of how things can go wrong, this macro works great in a module scope but fails in a function scope.
macro_rules! broken {
($m:ident :: $t:ident) => {
pub struct $t;
pub mod $m {
pub use super::$t;
}
} }
broken!(m::T); // okay, expands to T and m::T
fn g() {
broken!(m::U); // fails to compile, super::U refers to the containing module not g
}
The only fix I know is to introduce another module:
mod outer {
    struct OuterTestStruct {}
    fn do_thing() {
        mod middle { // <----------------
            struct InnerTestStruct {}
            mod inner {
                use super::super::OuterTestStruct;
                use super::InnerTestStruct;
            }
        }
    }
}
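If the macro does not need to expose a named module, another option is to avoid `super` entirely by generating the supporting code inside an unnamed const. The sketch below is only an illustration: the `Named` trait and the `works_everywhere!` macro are hypothetical names invented for this example, and this variant does not reproduce the `m::T` re-export of the `broken!` macro.

```rust
// Hypothetical trait used only to give the generated type some behavior.
trait Named {
    fn name() -> &'static str;
}

macro_rules! works_everywhere {
    ($t:ident) => {
        pub struct $t;

        // No `super::` path: the unnamed const can see `$t` in whatever
        // scope the macro was invoked in, module or function.
        const _: () = {
            impl Named for $t {
                fn name() -> &'static str {
                    stringify!($t)
                }
            }
        };
    };
}

works_everywhere!(T); // module scope

fn g() -> &'static str {
    works_everywhere!(U); // function scope
    <U as Named>::name()
}

fn module_level_name() -> &'static str {
    <T as Named>::name()
}
```

Here `g()` returns "U" and `module_level_name()` returns "T", so the same expansion compiles in both positions that the API Guidelines ask macro authors to test.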

Storing a module as a variable in rust

I have 2 different modules with precisely the same implementation, same functions, types, etc. They just do different things. I would like to be able to choose one of these modules at runtime and use it exclusively. Furthermore, there are several of these modules that may or may not exist at compile-time based on platform, features, etc. this is a link to a super stripped-down version of what I want. I am trying to choose between the various gfx-hal backends. The best I have been able to come up with is a macro that creates an if statement for each possible module then fires that if statement whenever a function in a module is run. However, this doesn't really seem elegant or at all good. So is there a way to store the modules in a variable and access it, or some way to do this that mimics that?
Thanks in advance
You could do this by turning each of your modules into its own trait implementation, similar to how gfx-rs does things.
Your "trait" would in actuality never be implemented with state, and instead be a collection of associated items like functions, other types, etc.
You could package it up like so:
#![allow(dead_code)]
mod foo {
pub fn print() { println!("hello from foo") }
}
mod bar {
pub fn print() { println!("hello from bar"); }
}
mod zam { // this may not exist depending on the platform, one will always exist
pub fn print() { println!("hello from zam"); }
}
struct FOO;
struct BAR;
struct ZAM;
trait RuntimeModule {
fn print();
}
impl RuntimeModule for FOO {
fn print() { foo::print(); }
}
impl RuntimeModule for BAR {
fn print() { bar::print(); }
}
impl RuntimeModule for ZAM {
fn print() { zam::print(); }
}
fn main() {
// Here we decide which to use
print_module::<FOO>();
}
// This is our "entrypoint"
fn print_module<T: RuntimeModule>() {
T::print();
}
If we decide which to use at runtime (in this case in main), we can then call a generic function which will use the associated types/functions to make decisions.
Note that you would not be able to use Box<dyn RuntimeModule> if RuntimeModule contained associated types that were different for each implementation.
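In the example above the backend is fixed by the turbofish in main. To make the choice at run time, one possibility is to branch once on a value and call the generic entry point with the matching type. This is a minimal sketch; the string keys, the `run_selected` function, and returning a `String` instead of printing are assumptions for illustration.

```rust
trait RuntimeModule {
    fn print() -> String;
}

struct Foo;
struct Bar;

impl RuntimeModule for Foo {
    fn print() -> String {
        "hello from foo".to_string()
    }
}

impl RuntimeModule for Bar {
    fn print() -> String {
        "hello from bar".to_string()
    }
}

// Generic "entrypoint": everything below this call is monomorphized
// for the chosen backend.
fn print_module<T: RuntimeModule>() -> String {
    T::print()
}

// The only place that inspects the run-time choice.
fn run_selected(backend: &str) -> Option<String> {
    match backend {
        "foo" => Some(print_module::<Foo>()),
        "bar" => Some(print_module::<Bar>()),
        _ => None, // unknown or unavailable backend
    }
}
```

A backend that only exists on some platforms could be handled by putting `#[cfg(...)]` on its `impl` and on its `match` arm, so the branch disappears when the module is not compiled in.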

Convenient 'Option<Box<Any>>' access when success is assured?

When writing callbacks for generic interfaces, it can be useful for them to define their own local data which they are responsible for creating and accessing.
In C I would just use a void pointer, C-like example:
struct SomeTool {
    int type;
    void *custom_data;
};
void invoke(struct SomeTool *tool) {
    StructOnlyForThisTool *data = malloc(sizeof(*data));
    /* ... fill in the data ... */
    tool->custom_data = data;
}
void execute(struct SomeTool *tool) {
    StructOnlyForThisTool *data = tool->custom_data;
    if (data->foo_bar) { /* do something */ }
}
When writing something similar in Rust, I replace void * with Option<Box<Any>>; however, I'm finding that accessing the data is unreasonably verbose, e.g.:
use std::any::Any;
struct SomeTool {
    kind: i32, // `type` is a reserved keyword in Rust
    custom_data: Option<Box<dyn Any>>,
}
fn invoke(tool: &mut SomeTool) {
    let data = StructOnlyForThisTool { /* my custom data */ };
    /* ... fill in the data ... */
    tool.custom_data = Some(Box::new(data));
}
fn execute(tool: &mut SomeTool) {
    let data = tool.custom_data.as_ref().unwrap().downcast_ref::<StructOnlyForThisTool>().unwrap();
    if data.foo_bar { /* do something */ }
}
There is one line here which I'd like to be able to write in a more compact way:
tool.custom_data.as_ref().unwrap().downcast_ref::<StructOnlyForThisTool>().unwrap()
tool.custom_data.as_ref().unwrap().downcast_mut::<StructOnlyForThisTool>().unwrap()
While each method makes sense on its own, in practice it's not something I'd want to write throughout a code-base, and not something I'm going to want to type out often or remember easily.
By convention, the uses of unwrap here aren't dangerous because:
While only some tools define custom data, the ones that do always define it.
When the data is set, by convention the tool only ever sets its own data. So there is no chance of having the wrong data.
Any time these conventions aren't followed, it's a bug and should panic.
Given these conventions, and assuming accessing custom-data from a tool is something that's done often - what would be a good way to simplify this expression?
Some possible options:
Remove the Option, just use Box<Any> with Box::new(()) representing None so access can be simplified a little.
Use a macro or function to hide verbosity - passing in the Option<Box<Any>>: will work of course, but prefer not - would use as a last resort.
Add a trait to Option<Box<Any>> which exposes a method such as tool.custom_data.unwrap_box::<StructOnlyForThisTool>() with matching unwrap_box_mut.
Update 1): since asking this question a point I didn't include seems relevant.
There may be multiple callback functions like execute which must all be able to access the custom_data. At the time I didn't think this was important to point out.
Update 2): Wrapping this in a function which takes tool isn't practical, since the borrow checker then prevents further access to members of tool until the cast variable goes out of scope, I found the only reliable way to do this was to write a macro.
If the implementation really only has a single method with a name like execute, that is a strong indication to consider using a closure to capture the implementation data. SomeTool can incorporate an arbitrary callable in a type-erased manner using a boxed FnMut, as shown in this answer. execute() then boils down to invoking the closure stored in the struct field, e.g. (self.impl_)(). For a more general approach that also works when the implementation has more methods, read on.
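The closure variant could look roughly like the following sketch. The `make_counter_tool` function, the captured counter, and the `String` return value are assumptions added so the behavior can be observed; the original only describes storing a boxed FnMut in a field.

```rust
struct SomeTool {
    // Type-erased callable; the captured variables are the "custom data".
    impl_: Box<dyn FnMut() -> String>,
}

impl SomeTool {
    fn new<F>(f: F) -> Self
    where
        F: FnMut() -> String + 'static,
    {
        SomeTool { impl_: Box::new(f) }
    }

    fn execute(&mut self) -> String {
        // Parentheses are required to call a closure stored in a field.
        (self.impl_)()
    }
}

// Hypothetical tool whose private state is a call counter.
fn make_counter_tool() -> SomeTool {
    let mut calls = 0u32;
    SomeTool::new(move || {
        calls += 1;
        format!("called {} time(s)", calls)
    })
}
```

Each call to `execute()` on the tool returned by `make_counter_tool()` sees the state left by the previous call, with no downcasting involved.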
An idiomatic and type-safe equivalent of the type+dataptr C pattern is to store the implementation type and pointer to data together as a trait object. The SomeTool struct can contain a single field, a boxed SomeToolImpl trait object, where the trait specifies tool-specific methods such as execute. This has the following characteristics:
You no longer need an explicit type field because the run-time type information is incorporated in the trait object.
Each tool's implementation of the trait methods can access its own data in a type-safe manner without casts or unwraps. This is because the trait object's vtable automatically invokes the correct function for the correct trait implementation, and it is a compile-time error to try to invoke a different one.
The "fat pointer" representation of the trait object has the same performance characteristics as the type+dataptr pair - for example, the size of SomeTool will be two pointers, and accessing the implementation data will still involve a single pointer dereference.
Here is an example implementation:
struct SomeTool {
impl_: Box<dyn SomeToolImpl>,
}
impl SomeTool {
fn execute(&mut self) {
self.impl_.execute();
}
}
trait SomeToolImpl {
fn execute(&mut self);
}
struct SpecificTool1 {
foo_bar: bool
}
impl SpecificTool1 {
pub fn new(foo_bar: bool) -> SomeTool {
let my_data = SpecificTool1 { foo_bar: foo_bar };
SomeTool { impl_: Box::new(my_data) }
}
}
impl SomeToolImpl for SpecificTool1 {
fn execute(&mut self) {
println!("I am {}", self.foo_bar);
}
}
struct SpecificTool2 {
num: u64
}
impl SpecificTool2 {
pub fn new(num: u64) -> SomeTool {
let my_data = SpecificTool2 { num: num };
SomeTool { impl_: Box::new(my_data) }
}
}
impl SomeToolImpl for SpecificTool2 {
fn execute(&mut self) {
println!("I am {}", self.num);
}
}
pub fn main() {
let mut tool1: SomeTool = SpecificTool1::new(true);
let mut tool2: SomeTool = SpecificTool2::new(42);
tool1.execute();
tool2.execute();
}
Note that, in this design, it doesn't make sense to make implementation an Option because we always associate the tool type with the implementation. While it is perfectly valid to have an implementation without data, it must always have a type associated with it.
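If the Option<Box<Any>> field has to stay, the extension-trait option from the question can be sketched as below. The names `UnwrapBox`, `unwrap_box`, and `unwrap_box_mut` follow the question's suggestion; the panic messages and the `kind` field are assumptions. Because the methods borrow only the `custom_data` field rather than the whole tool, other fields should remain accessible while the data is borrowed.

```rust
use std::any::Any;

trait UnwrapBox {
    fn unwrap_box<T: Any>(&self) -> &T;
    fn unwrap_box_mut<T: Any>(&mut self) -> &mut T;
}

impl UnwrapBox for Option<Box<dyn Any>> {
    fn unwrap_box<T: Any>(&self) -> &T {
        self.as_ref()
            .expect("custom data was never set")
            .downcast_ref::<T>()
            .expect("custom data has an unexpected type")
    }

    fn unwrap_box_mut<T: Any>(&mut self) -> &mut T {
        self.as_mut()
            .expect("custom data was never set")
            .downcast_mut::<T>()
            .expect("custom data has an unexpected type")
    }
}

struct StructOnlyForThisTool {
    foo_bar: bool,
}

struct SomeTool {
    kind: i32,
    custom_data: Option<Box<dyn Any>>,
}

fn invoke(tool: &mut SomeTool) {
    tool.custom_data = Some(Box::new(StructOnlyForThisTool { foo_bar: true }));
}

fn execute(tool: &mut SomeTool) {
    let data = tool.custom_data.unwrap_box_mut::<StructOnlyForThisTool>();
    if data.foo_bar {
        // A different field of `tool` is still usable while `data` is live.
        tool.kind += 1;
    }
    data.foo_bar = false;
}
```

Both conventions from the question are enforced by the two `expect` calls: missing data or data of the wrong type panics with a message instead of being silently ignored.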

Recommended way to wrap C lib initialization/destruction routine

I am writing a wrapper/FFI for a C library that requires a global initialization call in the main thread as well as one for destruction.
Here is how I am currently handling it:
struct App;
impl App {
fn init() -> Self {
unsafe { ffi::InitializeMyCLib(); }
App
}
}
impl Drop for App {
fn drop(&mut self) {
unsafe { ffi::DestroyMyCLib(); }
}
}
which can be used like:
fn main() {
let _init_ = App::init();
// ...
}
This works fine, but it feels like a hack, tying these calls to the lifetime of an unnecessary struct. Having the destructor in a finally (Java) or at_exit (Ruby) block seems theoretically more appropriate.
Is there some more graceful way to do this in Rust?
EDIT
Would it be possible/safe to use this setup like so (using the lazy_static crate), instead of my second block above:
lazy_static! {
static ref APP: App = App::init();
}
Would this reference be guaranteed to be initialized before any other code and destroyed on exit? Is it bad practice to use lazy_static in a library?
This would also make it easier to facilitate access to the FFI through this one struct, since I wouldn't have to bother passing around the reference to the instantiated struct (called _init_ in my original example).
This would also make it safer in some ways, since I could make the App struct default constructor private.
I know of no way of enforcing that a method be called in the main thread beyond strongly-worded documentation. So, ignoring that requirement... :-)
Generally, I'd use std::sync::Once, which seems basically designed for this case:
A synchronization primitive which can be used to run a one-time global
initialization. Useful for one-time initialization for FFI or related
functionality. This type can only be constructed with the ONCE_INIT
value.
Note that there's no provision for any cleanup; many times you just have to leak whatever the library has done. Usually if a library has a dedicated cleanup path, it has also been structured to store all that initialized data in a type that is then passed into subsequent functions as some kind of context or environment. This would map nicely to Rust types.
Warning
Your current code is not as protective as you hope it is. Since your App is an empty struct, an end-user can construct it without calling your method:
let _init_ = App;
We will use a zero-sized argument to prevent this. See also What's the Rust idiom to define a field pointing to a C opaque pointer? for the proper way to construct opaque types for FFI.
Altogether, I'd use something like this:
use std::sync::Once;
mod ffi {
extern "C" {
pub fn InitializeMyCLib();
pub fn CoolMethod(arg: u8);
}
}
static C_LIB_INITIALIZED: Once = Once::new();
#[derive(Copy, Clone)]
struct TheLibrary(());
impl TheLibrary {
fn new() -> Self {
C_LIB_INITIALIZED.call_once(|| unsafe {
ffi::InitializeMyCLib();
});
TheLibrary(())
}
fn cool_method(&self, arg: u8) {
unsafe { ffi::CoolMethod(arg) }
}
}
fn main() {
let lib = TheLibrary::new();
lib.cool_method(42);
}
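The code above needs the C library to link. To observe the run-once behavior in isolation, here is a sketch in which a counter stands in for the FFI call. `fake_initialize`, `INIT_CALLS`, and `init_calls` are hypothetical stand-ins and not part of any library.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Once;

// Counts how often the stand-in initializer actually ran.
static INIT_CALLS: AtomicUsize = AtomicUsize::new(0);
static C_LIB_INITIALIZED: Once = Once::new();

// Stand-in for `unsafe { ffi::InitializeMyCLib() }`.
fn fake_initialize() {
    INIT_CALLS.fetch_add(1, Ordering::SeqCst);
}

#[derive(Copy, Clone)]
struct TheLibrary(()); // private field: cannot be built outside `new`

impl TheLibrary {
    fn new() -> Self {
        // Runs `fake_initialize` on the first call only, even across threads.
        C_LIB_INITIALIZED.call_once(fake_initialize);
        TheLibrary(())
    }
}

fn init_calls() -> usize {
    INIT_CALLS.load(Ordering::SeqCst)
}
```

After constructing `TheLibrary` several times, `init_calls()` reports 1, which is the property the wrapper relies on.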
I did some digging around to see how other FFI libs handle this situation. Here is what I am currently using (similar to @Shepmaster's answer and based loosely on the initialization routine of curl-rust):
use std::sync::Once;
fn initialize() {
    static INIT: Once = Once::new();
    INIT.call_once(|| unsafe {
        ffi::InitializeMyCLib();
        // Register cleanup to run at process exit; atexit returns 0 on success.
        assert_eq!(libc::atexit(cleanup), 0);
    });
    extern "C" fn cleanup() {
        unsafe { ffi::DestroyMyCLib(); }
    }
}
I then call this function inside the public constructors for my public structs.

Use of undeclared type that is defined in another file

So, I have a hierarchy of files that have their own classes. Here is an example:
mod query;
struct Row<T>{
data: Vec<Query<T>>,
}
impl<T> Row<T>{
fn new(array: Vec<Query<T>>) -> Row<T>{
Row{
data: array,
}
}
}
Although it says the files are there, it says that "Query is an undeclared type," even when it exists in another file. The code works when everything is in the same file.
This is documented in the Rust book, specifically the section on modules. When you have different modules, you need to bring items from the other modules into scope using the use keyword.
mod query {
pub struct Query;
}
// Bring Query into scope
use query::Query;
struct Row(Vec<Query>);
fn main() {}
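Applied to the generic Row<T> from the question, the same fix might look like the sketch below. An inline module is used so the example fits in one file; with a separate query.rs the `mod query;` line stays as in the question. The `value` field and the `len` method are assumptions for illustration. Note that Query has to be declared `pub` for the import to work.

```rust
mod query {
    // `pub` makes the type (and its field) visible to the parent module.
    pub struct Query<T> {
        pub value: T,
    }
}

// Bring Query into scope; without this line the name must be written
// as `query::Query<T>` everywhere.
use query::Query;

struct Row<T> {
    data: Vec<Query<T>>,
}

impl<T> Row<T> {
    fn new(array: Vec<Query<T>>) -> Row<T> {
        Row { data: array }
    }

    fn len(&self) -> usize {
        self.data.len()
    }
}
```

With this in place, `Row::new(vec![Query { value: 1 }, Query { value: 2 }])` builds a row of two queries.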

Resources