How to write a wrapper enum for C error codes?


I am writing a Rust wrapper for a C API. It contains a function that may fail, in which case it returns an error code encoded as an int. Let's call these SOME_ERROR and OTHER_ERROR, and they will have the values 1 and 2, respectively. I want to write an enum wrapping these error codes, as follows:
// Declared in a separate C header
const SOME_ERROR: c_int = 1;
const OTHER_ERROR: c_int = 2;
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(i32)]
enum ErrorCodeWrapper {
SomeError = SOME_ERROR,
OtherError = OTHER_ERROR,
}
Here comes my first question. It does not seem to be possible to specify std::os::raw::c_int as the underlying type of an enum. But I do feel like it should be, as int isn't required to be 32 bits wide. Is there any way to achieve this?
I'd then like some methods to convert to and from a raw error code:
use std::os::raw::c_int;
impl ErrorCodeWrapper {
fn from_raw(raw: c_int) -> Option<Self> {
match raw {
SOME_ERROR => Some(Self::SomeError),
OTHER_ERROR => Some(Self::OtherError),
_ => None
}
}
unsafe fn from_raw_unchecked(raw: c_int) -> Self {
*(&raw as *const _ as *const Self)
}
fn as_raw(self) -> c_int {
unsafe { *(&self as *const _ as *const c_int) }
}
}
The only way I could find to "bit-cast" c_int to and from ErrorCodeWrapper is to do it C-style, by casting a pointer and then dereferencing it. This should work, as ErrorCodeWrapper and int have the same size and alignment, and the value of every ErrorCodeWrapper variant maps to its corresponding error code. However, this solution is a bit too hacky for my taste; is there a more idiomatic one, like C++'s std::bit_cast?
Furthermore, is it possible to replace the match statement in ErrorCodeWrapper::from_raw with a simple validity check, for simpler code in the case of more variants?
The last bit of code, the necessary error implementations:
use std::{fmt::Display, error::Error};
impl Display for ErrorCodeWrapper {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
write!(f, "{}", match self {
Self::SomeError => "some error",
Self::OtherError => "some other error",
})
}
}
impl Error for ErrorCodeWrapper {}
Now let's imagine a second wrapper, SuperErrorCodeWrapper, that includes some or all of the variants of ErrorCodeWrapper, with the same description and everything. That would mean that either:
One could "factor out" the common variants of ErrorCodeWrapper and SuperErrorCodeWrapper into a separate enum. ErrorCodeWrapper and SuperErrorCodeWrapper would then have a variant containing this enum. However I am not really fond of this kind of nesting, which would seem arbitrary when focusing on one particular error.
Duplicating the common variants across both enums.
The latter would add a lot to the existing boilerplate. Could a macro be a viable option to handle this?
Is there a library that could handle all this for me?

Here comes my first question. It does not seem to be possible to specify std::os::raw::c_int as the underlying type of an enum. But I do feel like it should be, as int isn't required to be 32 bits wide. Is there any way to achieve this?
No. There was an RFC in 2016 (I can't access the RFC text; it seems to have been removed), but it was closed:
We discussed in the #rust-lang/lang meeting and decided that while the RFC is well-motivated, it doesn't sufficiently address the various implementation complexities that must be overcome nor the interaction with hygiene. It would make sense to extend the attribute system to support more general paths before considering this RFC (but that is a non-trivial undertaking).
The best you can do is to use #[cfg_attr] to cover all configurations. c_int is defined here, and the current options are:
#[cfg_attr(any(target_arch = "avr", target_arch = "msp430"), repr(i16))]
#[cfg_attr(not(any(target_arch = "avr", target_arch = "msp430")), repr(i32))]
enum ErrorCodeWrapper { ... }
is there a more idiomatic one, like C++'s std::bit_cast?
Yes; std::mem::transmute().
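As a minimal sketch of what that could look like (assuming a target where c_int is i32, as in the question's #[repr(i32)]): going from the enum to the integer needs no unsafe code at all, since an `as` cast is enough, and transmute() is only needed for the unchecked direction. Transmuting a value that is not a valid discriminant is undefined behavior, so the checked from_raw() from the question is still the safe entry point.

```rust
use std::os::raw::c_int;

const SOME_ERROR: c_int = 1;
const OTHER_ERROR: c_int = 2;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(i32)] // assumption: c_int is i32 on the target
enum ErrorCodeWrapper {
    SomeError = SOME_ERROR,
    OtherError = OTHER_ERROR,
}

impl ErrorCodeWrapper {
    /// Enum to integer: a plain `as` cast, no unsafe needed.
    fn as_raw(self) -> c_int {
        self as c_int
    }

    /// Integer to enum without a check.
    ///
    /// # Safety
    /// `raw` must be the discriminant of one of the variants.
    unsafe fn from_raw_unchecked(raw: c_int) -> Self {
        // SAFETY: the caller guarantees `raw` is a valid discriminant, and
        // the enum has the same size as c_int thanks to #[repr(i32)].
        unsafe { std::mem::transmute::<c_int, Self>(raw) }
    }
}
```

transmute() refuses to compile if the two types differ in size, which is a small advantage over the pointer cast.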
One could "factor out" the common variants of ErrorCodeWrapper and SuperErrorCodeWrapper into a separate enum. ErrorCodeWrapper and SuperErrorCodeWrapper would then have a variant containing this enum.
If you do that, you lose the ability to transmute() (or pointer cast, it doesn't matter), as they will no longer be layout-compatible with int.
Could a macro be a viable option to handle this?
Probably yes.
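One way such a macro might look is sketched below. The macro name error_codes!, its syntax, and the ThirdError variant are invented for illustration. It generates the enum, from_raw(), as_raw(), Display and Error from a single list, so each variant is written once per enum. It does not share variants between the two enums; they are still listed in both invocations.

```rust
// Hypothetical helper macro: generates an error enum plus its conversions
// and trait impls from one list of `Variant = code => "message"` entries.
macro_rules! error_codes {
    ($name:ident { $($variant:ident = $code:expr => $msg:literal),+ $(,)? }) => {
        #[derive(Debug, Clone, Copy, PartialEq, Eq)]
        enum $name { $($variant),+ }

        impl $name {
            /// Checked conversion from the raw C error code.
            fn from_raw(raw: std::os::raw::c_int) -> Option<Self> {
                $(if raw == $code { return Some(Self::$variant); })+
                None
            }

            /// Conversion back to the raw C error code.
            fn as_raw(self) -> std::os::raw::c_int {
                match self { $(Self::$variant => $code),+ }
            }
        }

        impl std::fmt::Display for $name {
            fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
                f.write_str(match self { $(Self::$variant => $msg),+ })
            }
        }

        impl std::error::Error for $name {}
    };
}

error_codes!(ErrorCodeWrapper {
    SomeError = 1 => "some error",
    OtherError = 2 => "some other error",
});

// ThirdError is a made-up extra variant for the "super" wrapper.
error_codes!(SuperErrorCodeWrapper {
    SomeError = 1 => "some error",
    OtherError = 2 => "some other error",
    ThirdError = 3 => "a third error",
});
```

Because this version converts with a match instead of relying on layout, it does not need #[repr(...)] or transmute() at all.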
Is there a library that could handle all this for me?
I don't know of a library that handles all of this (although it is possible that one exists), but there is thiserror (and friends) for the Error and Display implementations, and strum::FromRepr, which can help you with from_raw().

Related

How do I inspect function arguments at runtime in Rust?

Say I have a trait that looks like this:
use std::{error::Error, fmt::Debug};
use super::CheckResult;
/// A Checker is a component that is responsible for checking a
/// particular aspect of the node under investigation, be that metrics,
/// system information, API checks, load tests, etc.
#[async_trait::async_trait]
pub trait Checker: Debug + Sync + Send {
type Input: Debug;
/// This function is expected to take input, whatever that may be,
/// and return a vec of check results.
async fn check(&self, input: &Self::Input) -> anyhow::Result<Vec<CheckResult>>;
}
And say I have two implementations of this trait:
pub struct ApiData {
some_response: String,
}
pub struct MetricsData {
number_of_events: u64,
}
pub struct ApiChecker;
impl Checker for ApiChecker {
type Input = ApiData;
// implement check function
}
pub struct MetricsChecker;
impl Checker for MetricsChecker {
type Input = MetricsData;
// implement check function
}
In my code I have a Vec of these Checkers that looks like this:
pub struct MyServer {
checkers: Vec<Box<dyn Checker>>,
}
What I want to do is figure out, based on what Checkers are in this Vec, what data I need to fetch. For example, if it just contained an ApiChecker, I would only need to fetch the ApiData. If both ApiChecker and MetricsChecker were there, I'd need both ApiData and MetricsData. You can also imagine a third checker where Input = (ApiData, MetricsData). In that case I'd still just need to fetch ApiData and MetricsData once.
I imagine an approach where the Checker trait has an additional function on it that looks like this:
fn required_data(&self) -> HashSet<DataId>;
This could then return something like [DataId::Api, DataId::Metrics]. I would then run this for all Checkers in my vec and end up with a complete list of the data I need to get. I could then do some complicated set of checks like this:
let mut required_data = HashSet::new();
for checker in &checkers {
// extend() merges each checker's requirements into the combined set
required_data.extend(checker.required_data());
}
let mut api_data: Option<ApiData> = None;
if required_data.contains(&DataId::Api) {
api_data = Some(get_api_data());
}
And so on for each of the data types.
I'd then pass them into the check calls like this:
api_checker.check(
api_data.expect("There was some logic error and we didn't get the API data even though a Checker declared that it needed it")
);
The reasons I want to fetch the data outside of the Checkers is:
To avoid fetching the same data multiple times.
To support memoization between unrelated calls where the arguments are the same (this could be done inside some kind of Fetcher trait implementation for example).
To support generic retry logic.
By now you can probably see that I've got two big problems:
The declaration of what data a specific Checker needs is duplicated, once in the function signature and again from the required_data function. This naturally introduces bug potential. Ideally this information would only be declared once.
Similarly, in the calling code, I have to trust that the data that the Checkers said they needed was actually accurate (the expect in the previous snippet). If it's not, and we didn't get data we needed, there will be problems.
I think both of these problems would be solved if the function signature, and specifically the Input associated type, was able to express this "required data" declaration on its own. Unfortunately I'm not sure how to do that. I see there is a nightly feature in any that implements Provider and Demand: https://doc.rust-lang.org/std/any/index.html#provider-and-demand. This sort of sounds like what I want, but I have to use stable Rust, plus I figure I must be missing something and there is an easier way to do this without going rogue with semi dynamic typing.
tl;dr: How can I inspect what types the arguments are for a function (keeping in mind that the input might be more complex than just one thing, such as a struct or tuple) at runtime from outside the trait implementer? Alternatively, is there a better way to design this code that would eliminate the need for this kind of reflection?

Your problems start way earlier than you mention:
checkers: Vec<Box<dyn Checker>>
This is an incomplete type. The associated type Input means that Checker<Input = ApiData> and Checker<Input = MetricsData> are incompatible. How would you call checkers[0].check(input)? What type would input be? If you want a collection of "checkers" then you'll need a unified API, where the arguments to .check() are all the same.
I would suggest a different route altogether: Instead of providing the input, provide a type that can retrieve the input that they ask for. That way there's no need to coordinate what type the checkers will ask for in a type-safe way, it'll be inherent to the methods the checkers themselves call. And if your primary concern is repeatedly retrieving the same data for different checkers, then all you need to do is implement caching in the provider. Same with retry logic.
Here's my suggestion:
struct DataProvider { /* cached api and metrics */ }
impl DataProvider {
fn fetch_api_data(&mut self) -> anyhow::Result<ApiData> { todo!() }
fn fetch_metrics_data(&mut self) -> anyhow::Result<MetricsData> { todo!() }
}
#[async_trait::async_trait]
trait Checker {
async fn check(&self, data: &mut DataProvider) -> anyhow::Result<Vec<CheckResult>>;
}
struct ApiAndMetricsChecker;
#[async_trait::async_trait]
impl Checker for ApiAndMetricsChecker {
async fn check(&self, data: &mut DataProvider) -> anyhow::Result<Vec<CheckResult>> {
let _api_data = data.fetch_api_data()?;
let _metrics_data = data.fetch_metrics_data()?;
// do something with api and metrics data
todo!()
}
}
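The caching mentioned above could be sketched roughly as follows. This is a simplified, synchronous, std-only version: the api_fetches counter, the placeholder response string, and the String error type are assumptions added for illustration, standing in for the real fetch and for anyhow::Result.

```rust
#[derive(Debug, Clone, PartialEq)]
struct ApiData {
    some_response: String,
}

#[derive(Default)]
struct DataProvider {
    // Cached result of the first successful fetch, if any.
    api: Option<ApiData>,
    // Illustration only: counts how often the "real" fetch ran.
    api_fetches: u32,
}

impl DataProvider {
    fn fetch_api_data(&mut self) -> Result<ApiData, String> {
        // Serve from the cache when a previous checker already asked.
        if let Some(cached) = &self.api {
            return Ok(cached.clone());
        }
        // Placeholder for the real (possibly retried) request.
        self.api_fetches += 1;
        let data = ApiData {
            some_response: "placeholder response".to_string(),
        };
        self.api = Some(data.clone());
        Ok(data)
    }
}
```

With this shape, every checker simply calls fetch_api_data(), and only the first call pays for the fetch.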

Metaprogramming name to function and type lookup in Rust?

I am working on a system which produces and consumes large numbers of "events", they are a name with some small payload of data, and an attached function which is used as a kind of fold-left over the data, something like a reducer.
I receive from the upstream something like {t: 'fieldUpdated', p: {new: 'new field value'}}, and must in my program associate the fieldUpdated "callback" function with the incoming event and apply it. There is a confirmation command I must echo back (which follows a programmatic naming convention), and each type is custom.
I tried using simple macros to do codegen for the structs, callbacks, and with the paste::paste! macro crate, and with the stringify macro I made quite good progress.
Regrettably however I did not find a good way to metaprogram these into a list or map using macros. Extending an enum through macros doesn't seem to be possible, and solutions such as the use of ctors seems extremely hacky.
My ideal case is something like this:
type evPayload = {
new: String
}
let evHandler = fn(evPayload: )-> Result<(), Error> { Ok(()) }
// ...
let data = r#"{"t": "fieldUpdated", "p": {"new": "new field value"}}"#;
let v: Value = serde_json::from_str(data)?;
Given only knowledge of data, how can I use macros (the boilerplate is actually 2-3 types, 3 functions, and some factory and helper functions) in a way that lets me do a name-to-function lookup?
It seems like Serde's adjacently or internally tagged representations would get me there, if I could modify an enum in a macro: https://serde.rs/enum-representations.html#internally-tagged
It almost feels like I need a macro which can either maintain an enum, or I can "cheat" and use module scoped ctors to do a quasi-static initialization of the names and types into a map.
My program would have on the order of 40-100 of these, with anything from 3-10 in a module. I don't think ctors are necessarily a problem here, but the fact that they're a little grey area handshake, and that ctors might preclude one day being able to cross-compile to wasm put me off a little.

I actually had need of something similar today; the enum macro part specifically. But beware of my method: here be dragons!
Someone more experienced than me — and less mad — should probably vet this. Please do not assume my SAFETY comments to be correct.
Also, if you don't have variants that collide with Rust keywords, you might want to tear out the '_' prefix hack entirely. I used a static mut byte array for that purpose, as manipulating strings was an order of magnitude slower, but that was benchmarked in a simplified function. There are likely better ways of doing this.
Finally, I am using it where failing to parse must cause a panic, so error handling is somewhat limited.
With that being said, here's my current solution:
/// NOTE: It is **imperative** that the length of this array is longer than the longest variant name +1
static mut CHECK_BUFF: [u8; 32] = [b'_'; 32];
macro_rules! str_enums {
($enum:ident: $($variant:ident),* $(,)?) => {
#[allow(non_camel_case_types)]
#[derive(Debug, Default, Hash, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum $enum {
#[default]
UNINIT,
$($variant),*,
UNKNOWN
}
impl FromStr for $enum {
type Err = String;
fn from_str(s: &str) -> Result<Self, Self::Err> {
unsafe {
// Compute and check the length before touching the buffer
let len = s.len() + 1;
assert!(CHECK_BUFF.len() >= len);
// SAFETY: Currently only single threaded
CHECK_BUFF[1..len].copy_from_slice(s.as_bytes());
// SAFETY: Safe as long as CHECK_BUFF.len() >= s.len() + 1
match from_utf8_unchecked(&CHECK_BUFF[..len]) {
$(stringify!($variant) => Ok(Self::$variant),)*
_ => Err(format!(
"{} variant not accounted for: {s} ({},)",
stringify!($enum),
from_utf8_unchecked(&CHECK_BUFF[..len])
))
}
}
}
}
impl From<&$enum> for &'static str {
fn from(variant: &$enum) -> Self {
unsafe {
match variant {
// SAFETY: The first byte is always '_', and stripping it off should be safe.
$($enum::$variant => from_utf8_unchecked(&stringify!($variant).as_bytes()[1..]),)*
$enum::UNINIT => {
eprintln!("uninitialized {}!", stringify!($enum));
""
}
$enum::UNKNOWN => {
eprintln!("unknown {}!", stringify!($enum));
""
}
}
}
}
}
impl Display for $enum {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
write!(f, "{}", Into::<&str>::into(self))
}
}
};
}
And then I call it like so:
str_enums!(
AttributeKind:
_alias,
_allowduplicate,
_altlen,
_api,
...
_enum,
_type,
_struct,
);
str_enums!(
MarkupKind:
_alias,
_apientry,
_command,
_commands,
...
);
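For the name-to-function lookup that the question asks about, a different and simpler technique than the enum macro above is a plain map from event names to function pointers. The sketch below assumes every handler can share one signature (here taking the payload as a &str, to stay std-only). The handler names, the handlers! macro, and the confirmation strings are all hypothetical.

```rust
use std::collections::HashMap;

// Assumption: all handlers share this signature.
type Handler = fn(&str) -> Result<String, String>;

fn field_updated(payload: &str) -> Result<String, String> {
    Ok(format!("fieldUpdatedConfirmed: {payload}"))
}

fn field_removed(payload: &str) -> Result<String, String> {
    Ok(format!("fieldRemovedConfirmed: {payload}"))
}

// Hypothetical helper: builds the name -> handler map from a list.
macro_rules! handlers {
    ($($name:literal => $func:expr),* $(,)?) => {{
        let mut map: HashMap<&'static str, Handler> = HashMap::new();
        $(
            let handler: Handler = $func;
            map.insert($name, handler);
        )*
        map
    }};
}

fn registry() -> HashMap<&'static str, Handler> {
    handlers! {
        "fieldUpdated" => field_updated,
        "fieldRemoved" => field_removed,
    }
}

// Looks up the handler by event name and applies it to the payload.
fn dispatch(
    reg: &HashMap<&'static str, Handler>,
    name: &str,
    payload: &str,
) -> Result<String, String> {
    match reg.get(name) {
        Some(handler) => handler(payload),
        None => Err(format!("no handler for event {name}")),
    }
}
```

This avoids ctors and static mut, at the cost of listing every handler in one place per module.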

Type handling in the trait object

I'm new to Rust and I recently ran into a problem with a trait.
I have a trait that is used as the source of a message and is stored in a structure as a Box trait object. I simplified my logic and the code looks something like this.
#[derive(Debug)]
enum Message {
MessageTypeA(i32),
MessageTypeB(f32),
}
enum Config {
ConfigTypeA,
ConfigTypeB,
}
trait Source {
fn next(&mut self) -> Message;
}
struct SourceA;
impl Source for SourceA {
fn next(&mut self) -> Message {
Message::MessageTypeA(1)
}
}
struct SourceB;
impl Source for SourceB {
fn next(&mut self) -> Message {
Message::MessageTypeB(1.1)
}
}
struct Test {
source: Box<dyn Source>,
}
impl Test {
fn new(config: Config) -> Self {
Test {
source: match config {
Config::ConfigTypeA => Box::new(SourceA{}),
Config::ConfigTypeB => Box::new(SourceB{}),
}
}
}
fn do_sth(&mut self) -> String {
match self.source.next() {
Message::MessageTypeA(a) => format!("a is {:?}", a),
Message::MessageTypeB(b) => format!("b is {:?}", b),
}
}
fn do_sth_else(&mut self, message: Message) -> String {
match message {
Message::MessageTypeA(a) => format!("a is {:?}", a),
Message::MessageTypeB(b) => format!("b is {:?}", b),
}
}
}
Different types of Source return different types of Message. The Test structure needs to create the corresponding trait object according to config and call next() in the do_sth function.
So you can see two enum types, Config and Message, which I feel is a strange usage, but I can't say exactly what is strange about it.
I tried to use an associated type on the trait, but then I need to specify the associated type when I declare the Test structure, like source: Box<dyn Source<Item=xxxx>>, and I don't actually know the exact type when creating the struct object.
Then I tried to use a generic type, but because of the requirements of the calling code, Test cannot be generic.
So please help me: is there a more elegant or idiomatic Rust solution to this situation?

So what might be going on here is that you're mis-using the idea of trait objects. The whole idea of a trait object is that you want to write code that relies purely on its interface. As soon as you find yourself checking for the underlying type of a trait object, that should raise a red flag.
Of course in your example you're not using any sort of weird run-time type checking; instead, you're checking the type implicitly via the enums.
Still, you recognize this as problematic. First, it becomes very clumsy when you try to add another variant to your sources, because now you have to go and add that to your message and config enums as well. Second, you then have to add that handling logic to everywhere the trait object is used. And finally, the type system "lies" a bit. It seems to me that source A will only ever send messages of the first variant and source B will only ever send messages of the second variant, and that's how we're telling them apart.
So what's a way out here?
First, trait objects should be designed such that they can be used without having to know which implementation we're dealing with. Traits represent roles that structs can play, and code that uses trait objects says "I'm happy to work with anyone who can play that role".
If your code isn't happy to work with anyone who can play that trait's role, there's a flaw in the design.
It would be good to know how you are, in general, processing the messages returned by the sources.
For example, does it matter for the rest of your program that source A only ever returns integers and source B only ever returns floats? And if so, what about it is it that matters, and could that be abstracted behind a trait?
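As one possible illustration, and assuming the callers only ever need the formatted string from do_sth, the per-source handling could move behind the trait so that Test never inspects which kind of message it got. The method name next_description is made up for this sketch.

```rust
enum Config {
    ConfigTypeA,
    ConfigTypeB,
}

// The trait now exposes what callers need, not the raw message type.
trait Source {
    fn next_description(&mut self) -> String;
}

struct SourceA;
impl Source for SourceA {
    fn next_description(&mut self) -> String {
        let a: i32 = 1;
        format!("a is {:?}", a)
    }
}

struct SourceB;
impl Source for SourceB {
    fn next_description(&mut self) -> String {
        let b: f32 = 1.1;
        format!("b is {:?}", b)
    }
}

struct Test {
    source: Box<dyn Source>,
}

impl Test {
    fn new(config: Config) -> Self {
        let source: Box<dyn Source> = match config {
            Config::ConfigTypeA => Box::new(SourceA),
            Config::ConfigTypeB => Box::new(SourceB),
        };
        Test { source }
    }

    // No match on a Message enum: the source knows how to describe itself.
    fn do_sth(&mut self) -> String {
        self.source.next_description()
    }
}
```

Adding a third source then only requires a new struct, its impl, and one more Config arm.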

Is it possible to assign the return value of different functions that return different structs (that implement a common trait) to a single variable?

Let's say I have something like this:
trait SomeTrait {}
struct One;
impl SomeTrait for One {}
struct Two;
impl SomeTrait for Two {}
fn return_one() -> One {
One
}
fn return_two() -> Two {
Two
}
Somewhere else, I want to essentially do:
fn do_stuff(op: Operation) {
let result = match op {
Operation::OpOne => return_one(),
Operation::OpTwo => return_two(),
};
}
That of course doesn't compile, as those two return_*() functions return distinct types. I've tried:
Declaring result as dyn SomeTrait (still error: mismatched types)
Casting the return values, e.g. return_one() as dyn SomeTrait (error: cast to unsized type: One as dyn SomeTrait)
Making Sized a supertrait of SomeTrait (this won't work for me in this particular case as I don't have control over the real-world version of SomeTrait, but it doesn't compile anyway: error: the trait SomeTrait cannot be made into an object)
Things I think would work but don't want to or can't do:
Boxing values on return, e.g. Box::new(return_one()) as Box<dyn SomeTrait> (having to move the values into a box, and thus off the stack, seems excessive)
Having return_one() and return_two() instead return impl SomeTrait (this would allow me to accidentally return Two from return_one(), for example, and I want to use the type system to prevent that)
Wrapping with an enum: I don't want the functions to return a wrapper enum, because then we have the same problem as the previous bullet point. I could wrap the return values in an enum at the call site, and that could work, but let's say there's some function on SomeTrait that I want to call at the end; it seems like a lot of extra boilerplate to then unwrap the enum and call that function for each inner type. If I were to do that, I might as well just copy-paste the trait function call to each match arm.
I found a few crates on crates.io that claim to be able to do this, but AFAICT they all require implementing a trait on the types, which are foreign types for me, so I can't do that.
Is it possible to make this work?

A possible option is to do the following
fn do_stuff(op: Operation) {
let (one, two);
let _result: &dyn SomeTrait = match op {
Operation::OpOne => {one = return_one(); &one},
Operation::OpTwo => {two = return_two(); &two},
};
}
You can also use &mut dyn SomeTrait instead if you need to borrow it mutably.
This is somewhat verbose, but if you find yourself doing it a lot, a macro
that declares the anonymous variables, assigns them and returns a reference might help.
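A self-contained version of the deferred-initialization approach is sketched below. The question's SomeTrait has no methods, so a hypothetical name() method is added here to show a call through the reference.

```rust
trait SomeTrait {
    // Hypothetical method, added only so there is something to call.
    fn name(&self) -> &'static str;
}

struct One;
impl SomeTrait for One {
    fn name(&self) -> &'static str {
        "one"
    }
}

struct Two;
impl SomeTrait for Two {
    fn name(&self) -> &'static str {
        "two"
    }
}

fn return_one() -> One {
    One
}

fn return_two() -> Two {
    Two
}

enum Operation {
    OpOne,
    OpTwo,
}

fn do_stuff(op: Operation) -> &'static str {
    // Declared but not initialized; only one of them is ever assigned.
    let (one, two);
    let result: &dyn SomeTrait = match op {
        Operation::OpOne => {
            one = return_one();
            &one
        }
        Operation::OpTwo => {
            two = return_two();
            &two
        }
    };
    // The value stays on the stack; only the call is dynamically dispatched.
    result.name()
}
```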
Another solution could be to use the auto_enums crate, which automatically creates the enum and implements the trait for it. The downside is that it only supports certain traits (mostly in std, I believe), and that for this specific use case it requires nightly or putting the match in a separate function.
I'm not sure I can link a specific part of the docs, but if you scroll down to "#Rust Nightly", you'll see your specific use of it, something like as follows
use auto_enums::auto_enum;
fn do_stuff(op: Operation) {
#[auto_enum(SomeTrait)]
let _result = match op {
Operation::OpOne => return_one(),
Operation::OpTwo => return_two(),
};
}
Although keep in mind this only works if auto_enums supports SomeTrait.

Is it possible to tell if a field is a certain type or implements a certain method in a procedural macro?

I created a procedural macro that implements a trait, but in order for this to work I need to get the raw bytes for every field. The problem is how to get the bytes of a field differs depending on the type of field.
Is there some way of testing if a function exists on a field and if it does not tries another function?
E.g. something like this:
if item.field::function_exist {
//do code
} else {
//do other code
}
Currently I am looking at creating another trait/member function that I just have to create for all primitives and create a procedural macro for larger fields such as structs. For example:
if item.field::as_bytes().exists {
(&self.#index).as_bytes()
} else {
let bytes = (&self.#index).to_bytes();
&bytes
}
A string has an as_bytes member function, while i32 does not. This means I need extra code when the member field of the struct is not a string. I might need a match rather than an if, but the if will suffice for the example.

Is it possible to tell if a field is a certain type or implements a certain method in a procedural macro?
No, it is not.
Macros operate on the abstract syntax tree (AST) of the Rust code. This means that you basically just get the characters that the user typed in.
If user code has something like type Foo = Option<Result<i32, MyError>>, and you process some code that uses Foo, the macro will not know that it's "really" an Option.
Even if it did know the type, knowing what methods are available would be even harder. Future crates can create traits which add methods to existing types. At the point in time that the procedural macro is running, these crates may not have even been compiled yet.
I am looking at creating another trait/member function that I just have to create for all primitives and create a procedural macro for larger fields such as structs.
This is the correct solution. If you look at any existing well-used procedural macro, that's exactly what it does. This allows the compiler to do what the compiler is intended to do.
This is also way better for maintainability — now these primitive implementations live in a standard Rust file, as opposed to embedded inside of a macro. Much easier to read and debug.
Your crate will have something like this:
// No real design put into this trait
trait ToBytes {
fn encode(&self, buf: &mut Vec<u8>);
}
impl ToBytes for str {
fn encode(&self, buf: &mut Vec<u8>) {
buf.extend(self.as_bytes())
}
}
// Other base implementations
And your procedural macro will implement this in the straightforward way:
#[derive(ToBytes)]
struct Foo {
a: A,
b: B,
}
becomes
impl ToBytes for Foo {
fn encode(&self, buf: &mut Vec<u8>) {
ToBytes::encode(&self.a, buf);
ToBytes::encode(&self.b, buf);
}
}
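Putting the pieces together, a small hand-written sketch of the same idea might look like this. The String and i32 implementations stand in for the "other base implementations", the little-endian encoding for i32 is an arbitrary choice made for this example, and the impl for Foo is written by hand in the shape the derive would generate.

```rust
// No real design put into this trait
trait ToBytes {
    fn encode(&self, buf: &mut Vec<u8>);
}

impl ToBytes for str {
    fn encode(&self, buf: &mut Vec<u8>) {
        buf.extend_from_slice(self.as_bytes());
    }
}

impl ToBytes for String {
    fn encode(&self, buf: &mut Vec<u8>) {
        // Delegate to the str implementation.
        self.as_str().encode(buf);
    }
}

impl ToBytes for i32 {
    fn encode(&self, buf: &mut Vec<u8>) {
        // Assumption for this sketch: integers are written little-endian.
        buf.extend_from_slice(&self.to_le_bytes());
    }
}

struct Foo {
    a: String,
    b: i32,
}

// What a derive would generate: one call per field, resolved by the compiler.
impl ToBytes for Foo {
    fn encode(&self, buf: &mut Vec<u8>) {
        ToBytes::encode(&self.a, buf);
        ToBytes::encode(&self.b, buf);
    }
}
```

The macro never needs to know whether a field is a string or an integer; trait resolution picks the right encode for each field type.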
As a concrete example, Serde does the same thing, with multiple ways of serializing to and from binary data:
Bincode
CBOR
MessagePack
etc.
