Strange struct definition from crate nix - struct

I just encountered this weird struct definition, it is in fcntl.rs from the crate nix.
pub struct OFlag: c_int {
/// Mask for the access mode of the file.
O_ACCMODE;
// other fields are omitted.
}
A normal struct in my perspective will be something like this:
struct Person{
name: String,
age: u8,
}
So, here are my doubts:
what is OFlag: c_int?
c_int is an type alias of i32. pub type c_int = i32;
Why don't its fields have any type annotation?
My surmise is that OFlag is of type c_int, and the fields are something similar to enum's fields.(compliant to the open syscall function signature int open(const char *pathname, int flags, mode_t mode) ) But this is just my guesswork, an explanation citing rust official doc would be appreciated.

The code you quoted is not valid Rust code on its own. It's code that gets passed to an internal macro of the nix crate called libc_bitflags!(). This macro takes the quoted code as input and transforms it into valid Rust code.
The libc_bitflags!() macro is a simple wrapper around the bitflags!() macro from the bitflags crate. The wrapper simplifies creating bitflags structs that take all their values from constants defined in the libc crate. For example this invocation
libc_bitflags!{
pub struct ProtFlags: libc::c_int {
PROT_NONE;
PROT_READ;
PROT_WRITE;
PROT_EXEC;
}
}
gets expanded to
bitflags!{
pub struct ProtFlags: libc::c_int {
const PROT_NONE = libc::PROT_NONE;
const PROT_READ = libc::PROT_READ;
const PROT_WRITE = libc::PROT_WRITE;
const PROT_EXEC = libc::PROT_EXEC;
}
}
which in turn will be expanded to Rust code by the bitflags!() macro. The libc::c_int type is used as the type of the bits field of the resulting struct. The constants inside will become associated constants of the resulting struct. See the documentation of the bitflags crate for further details.

If you look at the file you can see that the code inside the macro libc_bitflags!. The definition of the macro is here. There you can see that the macro ::bitflags::bitflags! is called and that libc_bitflags almost redirects the full input to bitflags. You can read more about that crate here.
Now to your questions:
OFlag will be after macro expansion a struct with a single attribute which is of type c_int:
pub struct OFlag {
bits: c_int,
}
The fields don't need a type because they won't exist anymore in the expanded code (after the macro was run). The bits are of the "type of the struct" so in your case c_int. The fields will be converted to associated constants:
impl OFlag {
pub const O_ACCMODE = Self { bits: libc::O_ACCMODE };
}
You can create an expansion of an example in the playground (Tools -> Expand macros)

Related

Writing to a field in a MaybeUninit structure?

I'm doing something with MaybeUninit and FFI in Rust that seems to work, but I suspect may be unsound/relying on undefined behavior.
My aim is to have a struct MoreA extend a struct A, by including A as an initial field. And then to call some C code that writes to the struct A. And then finalize MoreA by filling in its additional fields, based on what's in A.
In my application, the additional fields of MoreA are all integers, so I don't have to worry about assignments to them dropping the (uninitialized) previous values.
Here's a minimal example:
use core::fmt::Debug;
use std::mem::MaybeUninit;
#[derive(Clone, Copy, PartialEq, Debug)]
#[repr(C)]
struct A(i32, i32);
#[derive(Clone, Copy, PartialEq, Debug)]
#[repr(C)]
struct MoreA {
head: A,
more: i32,
}
unsafe fn mock_ffi(p: *mut A) {
// write doesn't drop previous (uninitialized) occupant of p
p.write(A(1, 2));
}
fn main() {
let mut b = MaybeUninit::<MoreA>::uninit();
unsafe { mock_ffi(b.as_mut_ptr().cast()); }
let b = unsafe {
let mut b = b.assume_init();
b.more = 3;
b
};
assert_eq!(&b, &MoreA { head: A(1, 2), more: 3 });
}
Is the code let b = unsafe { ... } sound? It runs Ok and Miri doesn't complain.
But the MaybeUninit docs say:
Moreover, uninitialized memory is special in that the compiler knows that it does not have
a fixed value. This makes it undefined behavior to have uninitialized data in a variable
even if that variable has an integer type, which otherwise can hold any fixed bit pattern.
Also, the Rust book says that Behavior considered undefined includes:
Producing an invalid value, even in private fields and locals. "Producing" a value happens any time a value is assigned to or read from a place, passed to a function/primitive operation or returned from a function/primitive operation. The following values are invalid (at their respective type):
... An integer (i*/u*) or ... obtained from uninitialized memory.
On the other hand, it doesn't seem possible to write to the more field before calling assume_init. Later on the same page:
There is currently no supported way to create a raw pointer or reference to a field of a struct
inside MaybeUninit. That means it is not possible to create a struct by calling
MaybeUninit::uninit::() and then writing to its fields.
If what I'm doing in the above code example does trigger undefined behavior, what would solutions be?
I'd like to avoid boxing the A value (that is, I'd like to have it be directly included in MoreA).
I'd hope also to avoid having to create one A to pass to mock_ffi and then having to copy the results into MoreA. A in my real application is a large structure.
I guess if there's no sound way to get what I'm after, though, I'd have to choose one of those two fallbacks.
If struct A is of a type that can hold the bit-pattern 0 as a valid value, then I guess a third fallback would be:
Start with MaybeUninit::zeroed() rather than MaybeUninit::uninit().
Currently, the only sound way to refer to uninitialized memory—of any type—is MaybeUninit. In practice, it is probably safe to read or write to uninitialized integers, but that is not officially documented. It is definitely not safe to read or write to an uninitialized bool or most other types.
In general, as the documentation states, you cannot initialize a struct field by field. However, it is sound to do so as long as:
the struct has repr(C). This is necessary because it prevents Rust from doing clever layout tricks, so that the layout of a field of type MaybeUninit<T> remains identical to the layout of a field of type T, regardless of its adjacent fields.
every field is MaybeUninit. This lets us assume_init() for the entire struct, and then later initialise each field individually.
Given that your struct is already repr(C), you can use an intermediate representation which uses MaybeIninit for every field. The repr(C) also means that we can transmute between the types once it is initialised, provided that the two structs have the same fields in the same order.
use std::mem::{self, MaybeUninit};
#[repr(C)]
struct MoreAConstruct {
head: MaybeUninit<A>,
more: MaybeUninit<i32>,
}
let b: MoreA = unsafe {
// It's OK to assume a struct is initialized when all of its fields are MaybeUninit
let mut b_construct = MaybeUninit::<MoreAConstruct>::uninit().assume_init();
mock_ffi(b_construct.head.as_mut_ptr());
b_construct.more = MaybeUninit::new(3);
mem::transmute(b_construct)
};
It is now possible (since Rust 1.51) to initialize fields of any uninitialized struct using the std::ptr::addr_of_mut macro. This example is from the documentation:
You can use MaybeUninit, and the std::ptr::addr_of_mut macro, to
initialize structs field by field:
#[derive(Debug, PartialEq)] pub struct Foo {
name: String,
list: Vec<u8>, }
let foo = {
let mut uninit: MaybeUninit<Foo> = MaybeUninit::uninit();
let ptr = uninit.as_mut_ptr();
// Initializing the `name` field
unsafe { addr_of_mut!((*ptr).name).write("Bob".to_string()); }
// Initializing the `list` field
// If there is a panic here, then the `String` in the `name` field leaks.
unsafe { addr_of_mut!((*ptr).list).write(vec![0, 1, 2]); }
// All the fields are initialized, so we call `assume_init` to get an initialized Foo.
unsafe { uninit.assume_init() } };
assert_eq!(
foo,
Foo {
name: "Bob".to_string(),
list: vec![0, 1, 2]
}
);

Extracting type T from Option<T>

I have some code generated by bindgen that has function pointers represented by Option<T>.
pub type SomeFnPointerType = ::std::option::Option<
unsafe extern "C" fn ( /* .. large number of argument types .. */)
-> /* return type */
> ;
I want to store the unwrapped values in a structure. However, there are no type aliases for the inner function pointer types, so how can I define that structure? I cannot refactor the code because it's automatically generated.
I want to do something like this (in C++'ish pseudo code):
struct FunctionTable {
decltype((None as SomeFnPointerType).unwrap()) unwrappedFn;
/* ... */
};
or maybe just SomeFnPointerType::T if that is allowed.
Is it possible to achieve that in Rust? If not, the only way I see it is to manually define those types by copy-pasting code from the generated file into a separate handwritten file and keep the types in sync manually.
You can define a trait that exposes the T as an associated type.
trait OptionExt {
type Type;
}
impl<T> OptionExt for Option<T> {
type Type = T;
}
type MyOption = Option<fn()>;
fn foo(f: <MyOption as OptionExt>::Type) {
f();
}
fn main() {
foo(|| {});
}

How do I expose a compile time generated static C string through FFI?

I am trying to embed a version number into a library. Ideally, this should be a static C string that can be read and doesn't need any additional allocation for reading the version number.
On the Rust side, I am using vergen to generate the versioning information like this:
pub static VERSION: &str = env!("VERGEN_SEMVER");
and I would like to end up with something like
#[no_mangle]
pub static VERSION_C: *const u8 = ... ;
There seems to be a way to achieve this using string literals, but I haven't found a way to do this with compile time strings. Creating a new CString seems to be beyond the current capabilities of static variables and tends to end with an error E0015.
A function returning the pointer like this would be acceptable, as long as it does not allocate new memory.
#[no_mangle]
pub extern "C" fn get_version() -> *const u8 {
// ...
}
The final type of the variable (or return type of the function) doesn't have to be based on u8, but should be translatable through cbindgen. If some other FFI type is more appropriate, using that is perfectly fine.
By ensuring that the static string slice is compatible with a C-style string (as in, it ends with the null terminator byte \0), we can safely fetch a pointer to the beginning of the slice and pass that across the boundary.
pub static VERSION: &str = concat!(env!("VERGEN_SEMVER"), "\0");
#[no_mangle]
pub extern "C" fn get_version() -> *const c_char {
VER.as_ptr() as *const c_char
}
Here's an example in the Playground, where I used the package's version as the environment variable to fetch and called the function in Rust.

How to write a custom derive macro?

I'm trying to write my own derive mode macro in Rust, and the documentation on it is somewhat lacking in examples.
I have a struct like:
#[derive(MyMacroHere)]
struct Example {
id: i64,
value: Option<String>,
}
I want my macro to generate a method à la
fn set_fields(&mut self, id: i64, value: Option<String>) {
// ...
}
What are the basic steps to use the TokenStream trait to achieve something like that?
Create a crate for your procedural macros:
cargo new my_derive --lib
Edit the Cargo.toml to make it a procedural macro crate:
[lib]
proc-macro = true
Implement your procedural macro:
extern crate proc_macro;
use proc_macro::TokenStream;
#[proc_macro_derive(MyMacroHere)]
pub fn my_macro_here_derive(input: TokenStream) -> TokenStream {
// ...
}
Import the procedural macro and use it:
extern crate my_derive;
use my_derive::MyMacroHere;
#[derive(MyMacroHere)]
struct Example {
id: i64,
value: Option<String>,
}
The hard part is in implementation of the macro. Most people use the syn and quote crates to parse the incoming Rust code and then generate new code.
For example, syn's documentation starts with an example of a custom derive. You will parse the struct (or enum or union) and then handle the various ways of defining a struct (unit, tuple, named fields). You'll collect the information you need (type, maybe name), then you'll generate the appropriate code.
See also:
How to Write a Custom derive Macro
Documentation for proc_macro
Documentation for syn
Documentation for quote
Is it possible to add your own derivable traits, or are these fixed by the compiler?

How to define a function with two enum parameters where the 2nd type is determined by the 1st?

I'm working on the rust-efl project, which is a binding to the Enlightenment EFL C library.
There is a function that takes two C-style enums to configure the library. I want to restrict the second argument to a certain enum based on the value of the first argument.
I'll use the actual code for this example since it makes it a bit easier to understand the use case.
Example call:
elementary::policy_set(ElmPolicy::Quit, ElmPolicyQuit::LastWindowClosed as i32);
Library Rust code:
#[repr(C)]
pub enum ElmPolicy {
/// under which circumstances the application should quit automatically. See ElmPolicyQuit
Quit = 0,
/// defines elm_exit() behaviour. See ElmPolicyExit
Exit,
/// defines how throttling should work. See ElmPolicyThrottle
Throttle
}
#[repr(C)]
pub enum ElmPolicyQuit {
/// never quit the application automatically
None = 0,
/// quit when the application's last window is closed
LastWindowClosed,
/// quit when the application's last window is hidden
LastWindowHidden
}
pub fn policy_set(policy: ElmPolicy, value: i32) {
There is an enum similar to ElmPolicyQuit for each value of ElmPolicy. The second argument to policy_set should be of the corresponding enum type.
I would like to modify policy_set so that the caller does not have to cast the value to an i32 by defining a type for value. Ideally, I would like Rust to check that the second argument is of the correct type for the given policy argument.
Implementation Attempt
I'm new to Rust, so this may be way off, but this is my current attempt:
pub fn policy_set<P: ElmPolicy, V: ElmPolicyValue<P>>(policy: P, value: V) {
unsafe { elm_policy_set(policy as c_int, value as c_int) }
}
trait ElmPolicyValue<P> {}
impl ElmPolicyValue<ElmPolicy::Quit> for ElmPolicyQuit {}
But I get this error:
src/elementary.rs:246:22: 246:31 error: ElmPolicy is not a trait
[E0404] src/elementary.rs:246 pub fn policy_set>(policy: P, value: V) {
(arrow points to first argument's type)
I made a dummy trait for ElmPolicy, but then I get
src/elementary.rs:111:21: 111:36 error: found value
elementary::ElmPolicy::Quit used as a type [E0248]
src/elementary.rs:111 impl ElmPolicyValue for
ElmPolicyQuit {}
So it seems like I can't use enums for generic types in this way.
What is the correct way to implement this?
I suppose I don't need these to actually be enums. I just need the values to be convertible to a c_int that corresponds with the EFL library's C enum.
Also, I thought about a single argument instead.
elementary::policy_set(elementary::Policy::Quit::LastWindowClosed);
But nested C-like enums don't seem to work despite documentation I found and I'm uncertain about using #[repr(C)] with nested enums.
Here's how I would do it: instead of defining ElmPolicy as an enum, I'd define it as a trait with an associated type that specifies the parameter type for the policy's value. Each policy would define a unit struct that implements ElmPolicy.
pub trait ElmPolicy {
type Value;
}
pub struct ElmPolicyQuit;
impl ElmPolicy for ElmPolicyQuit {
type Value = ElmPolicyQuitValue;
}
// TODO: ElmPolicyExit and ElmPolicyThrottle
#[repr(C)]
pub enum ElmPolicyQuitValue {
/// never quit the application automatically
None = 0,
/// quit when the application's last window is closed
LastWindowClosed,
/// quit when the application's last window is hidden
LastWindowHidden
}
pub fn policy_set<P: ElmPolicy>(policy: P, value: P::Value) {
// TODO
}
fn main() {
policy_set(ElmPolicyQuit, ElmPolicyQuitValue::LastWindowClosed);
//policy_set(ElmPolicyQuit, ElmPolicyQuitValue::LastWindowClosed as i32); // does not compile
}
You can add methods or supertraits to ElmPolicy and contraints to ElmPolicy::Value as necessary. For example, you might want to add Into<i32> as a constraint on ElmPolicy::Value, so that you can use value.into() in policy_set to convert the value to an i32.
I wouldn't try to solve both problems in one swoop. Create a straight-forward Rust binding to the C library that exposes the API as is (a *-sys package). Then provide a more idiomatic Rust API on top:
enum ElmPolicy {
Quit(QuitDetail),
Exit(ExitDetail),
Throttle(ThrottleDetail),
}
enum QuitDetail {
None,
LastWindowClosed,
LastWindowHidden,
}
Then you can create methods on ElmPolicy that produce a tuple of the C enums, or directly calls the appropriate C function, as desired.
Note that there is likely to be a bit of boilerplate as your enums in the *-sys crate will mirror those in the idiomatic API and you will want to convert between them. Macros may help reduce that.

Resources