I want to use a macro to generate
a) a struct with a variable number of members
b) a Default implementation for that struct, with the default values defined in the macro
c) a function with one or more parameters
The result should look like this:
struct Something {
x: i32,
y: i32,
}
impl Default for Something {
fn default() -> Self {
Something {
x: 0,
y: 0,
}
}
}
impl Something {
pub fn handle_event(
&mut self,
ev: AnEnum,
data_a: DataAType,
data_b: DataBType,
) -> usize {
0
}
}
Note:
handle_event always gets self and ev and optionally one or two additional types. Here DataAType and DataBType.
The struct itself should get 0 or more additional members like x or y as examples. Also the default values should be part of the macro.
I came up with this macro:
#[macro_export]
macro_rules! def_something{
($($data_a_t:ty)? $(,$data_b_t:ty)? $(,$element:ident: $ty:ty = $ex:expr)* ) => {
struct Something{
pub is_init: bool,
$($element: $ty),*
}
impl Default for Something{
fn default() -> Self{
Something {
is_init : false,
$($element: $ex),*
}
}
}
impl Something{
pub fn handle_event(&mut self, ev: AnEnum,
$(data_a:$data_a_t)? $(,data_b:$data_b_t)?
) -> usize {
0
}
}
};
}
Presently the macro can be used like this:
def_something!(DataAType, &mut crate::DataBType, x: i32=0, y: i32=0);
But fails with the following options:
def_something!(DataAType, DataBType, x: i32=0, y: i32=0);
def_something!(DataAType, x: i32=0, y: i32=0);
def_something!(x: i32=0, y: i32=0);
The error message is always: local ambiguity when calling macro def_something: multiple parsing options: built-in NTs ident
How can I make my macro so that the various options to call it are accepted?
Is there a better way for the default value definition?
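One possible way around the ambiguity, as a sketch rather than a definitive answer: `macro_rules!` does not backtrack, so after a comma it cannot tell whether a bare identifier starts a type (`$data_b_t:ty`) or a field name (`$element:ident`). A type such as `&mut crate::DataBType` cannot start an identifier, which is presumably why that one call parses. Wrapping the handler types in their own brackets removes the overlap. The bracket syntax, the extra `$name` parameter (added so the macro can be expanded several times in one file) and the placeholder types are assumptions of this sketch, not part of the original macro.

```rust
// Placeholder types standing in for the ones named in the question.
pub enum AnEnum {
    Start,
    Stop,
}
pub struct DataAType;
pub struct DataBType;

macro_rules! def_something {
    (
        $name:ident
        // Handler types live in their own bracket group, so they can never
        // be confused with the field list that follows.
        [$($data_a_t:ty $(, $data_b_t:ty)?)?]
        $($element:ident : $ty:ty = $ex:expr),* $(,)?
    ) => {
        pub struct $name {
            pub is_init: bool,
            $(pub $element: $ty,)*
        }

        impl Default for $name {
            fn default() -> Self {
                $name {
                    is_init: false,
                    $($element: $ex,)*
                }
            }
        }

        impl $name {
            #[allow(unused_variables)]
            pub fn handle_event(
                &mut self,
                ev: AnEnum
                $(, data_a: $data_a_t $(, data_b: $data_b_t)?)?
            ) -> usize {
                0
            }
        }
    };
}

// Two, one, and zero extra handler parameters.
def_something!(WithBoth [DataAType, &mut DataBType] x: i32 = 0, y: i32 = 0);
def_something!(WithOne [DataAType] x: i32 = 0, y: i32 = 0);
def_something!(WithNone [] x: i32 = 1, y: i32 = 2);
```

With this shape, `WithNone::default().handle_event(ev)` takes only the event, while `WithBoth` also expects a `DataAType` and a `&mut DataBType`.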
I'm working with two crates: A and B. I control both. I'd like to create a struct in A that has a field whose type is known only to B (i.e., A is independent of B, but B is dependent on A).
crate_a:
#[derive(Clone)]
pub struct Thing {
pub foo: i32,
pub bar: *const i32,
}
impl Thing {
fn new(x: i32) -> Self {
Thing { foo: x, bar: &0 }
}
}
crate_b:
struct Value {}
fn func1() {
let mut x = A::Thing::new(1);
let y = Value {};
x.bar = &y as *const Value as *const i32;
...
}
fn func2() {
...
let y = unsafe { &*(x.bar as *const Value) };
...
}
This works, but it doesn't feel very "rusty". Is there a cleaner way to do this? I thought about using a trait object, but ran into issues with Clone.
Note: My reason for splitting these out is that the dependencies in B make compilation very slow. Value above is actually from llvm_sys. I'd rather not leak that into A, which has no other dependency on llvm.
The standard way to implement something like this is with generics, which are kind of like type variables: they can be "assigned" a particular type, possibly within some constraints. This is how the standard library can provide types like Vec that work with types that you declare in your crate.
Basically, generics allow Thing to be defined in terms of "some unknown type that will become known later when this type is actually used."
Given the example in your code, it looks like Thing's bar field may or may not be set, which suggests that the built-in Option enum should be used. All you have to do is put a type parameter on Thing and pass that through to Option, like so:
pub mod A {
#[derive(Clone)]
pub struct Thing<T> {
pub foo: i32,
pub bar: Option<T>,
}
impl<T> Thing<T> {
pub fn new(x: i32) -> Self {
Thing { foo: x, bar: None }
}
}
}
pub mod B {
use crate::A;
struct Value;
fn func1() {
let mut x = A::Thing::new(1);
let y = Value;
x.bar = Some(y);
// ...
}
fn func2(x: &A::Thing<Value>) {
// ...
let y: &Value = x.bar.as_ref().unwrap();
// ...
}
}
Here, the x in B::func1() has the type Thing<Value>. You can see with this syntax how Value is substituted for T, which makes the bar field Option<Value>.
If Thing's bar isn't actually supposed to be optional, just write pub bar: T instead, and accept a T in Thing::new() to initialize it:
pub mod A {
#[derive(Clone)]
pub struct Thing<T> {
pub foo: i32,
pub bar: T,
}
impl<T> Thing<T> {
pub fn new(x: i32, y: T) -> Self {
Thing { foo: x, bar: y }
}
}
}
pub mod B {
use crate::A;
struct Value;
fn func1() {
let mut x = A::Thing::new(1, Value);
// ...
}
fn func2(x: &A::Thing<Value>) {
// ...
let y: &Value = &x.bar;
// ...
}
}
Note that the definition of Thing in both of these cases doesn't actually require that T implement Clone; however, Thing<T> will only implement Clone if T also does. #[derive(Clone)] will generate an implementation like:
impl<T> Clone for Thing<T> where T: Clone { /* ... */ }
This can allow your type to be more flexible -- it can now be used in contexts that don't require T to implement Clone, while also being cloneable when T does implement Clone. You get the best of both worlds this way.
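A small sketch of that behaviour, using a hypothetical `Opaque` type that deliberately does not implement `Clone` and a hypothetical helper `clone_thing`:

```rust
#[derive(Clone, Debug, PartialEq)]
pub struct Thing<T> {
    pub foo: i32,
    pub bar: Option<T>,
}

// Hypothetical payload type that deliberately does NOT implement Clone.
pub struct Opaque;

// Only callable when T: Clone, mirroring the bound that the derive generates.
pub fn clone_thing<T: Clone>(thing: &Thing<T>) -> Thing<T> {
    thing.clone()
}

// Thing<Opaque> is still a perfectly usable type; it just cannot be cloned.
pub fn make_opaque_thing() -> Thing<Opaque> {
    Thing { foo: 2, bar: Some(Opaque) }
}
```

Calling `clone_thing(&make_opaque_thing())` would be rejected at compile time, while `clone_thing` on a `Thing<u8>` works as expected.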
I'm trying to deserialize data in a simple non-human readable and non-self describing format to Rust structs. I've implemented a custom Deserializer for this format and it works great when I'm deserializing the data into a struct like this for example:
#[derive(Serialize, Deserialize)]
pub struct Position {
x: f32,
z: f32,
y: f32,
}
However, let's say this Position struct had a new field added (could have been removed too) in a new version:
#[derive(Serialize, Deserialize)]
pub struct Position {
x: f32,
z: f32,
y: f32,
is_visible: bool, // This field was added in a new version
}
But I still need to support data from both versions of Position. The version of the data (known at runtime) can be given to the Deserializer, but how can the Deserializer know the version of a field (known at compile time)?
I've looked at #[serde(deserialize_with)] but it didn't work because I cannot get the needed version information.
I've also looked at implementing Deserialize manually for Position, and I can obtain the versions of the fields of Position by implementing something like Position::get_version(field_name: &str).
However, I cannot figure how to get the version of the data currently being deserialized because Deserialize::deserialize only has a trait bound Deserializer<'de> and I cannot make that bound stricter by adding another bound (so it doesn't know about my custom Deserializer).
At this point, I'm thinking about giving the version data of each field when instantiating the Deserializer but I'm not sure if that will work or if there is a better way to go.
Multiple structs implementing a shared trait
If you have several different versions with several different types of struct, and you want a more robust way of handling different variants, it might be a better idea to write structs for each possible format. You can then define and implement a trait for shared behavior.
trait Position {
fn x(&self) -> f32;
fn y(&self) -> f32;
fn z(&self) -> f32;
fn version_number(&self) -> usize;
}
struct PositionV0 {
x: f32,
y: f32,
z: f32
}
impl Position for PositionV0 {
fn x(&self) -> f32 {
self.x
}
// You get the idea for the fn y, fn z implementations
fn version_number(&self) -> usize {
0
}
}
struct PositionV1 {
x: f32,
y: f32,
z: f32,
is_visible: bool,
}
impl Position for PositionV1 {
fn x(&self) -> f32 {
self.x
}
// You get the idea for the fn y, fn z implementations
fn version_number(&self) -> usize {
1
}
}
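To pick the right struct at runtime, the version-specific types can be hidden behind a trait object. The sketch here assumes a made-up byte layout (three little-endian f32 values in x, y, z order, plus one visibility byte from version 1 on); the `decode` and `read_f32` helpers are hypothetical and do not involve serde.

```rust
pub trait Position {
    fn x(&self) -> f32;
    fn y(&self) -> f32;
    fn z(&self) -> f32;
    fn version_number(&self) -> usize;
}

pub struct PositionV0 { pub x: f32, pub y: f32, pub z: f32 }
pub struct PositionV1 { pub x: f32, pub y: f32, pub z: f32, pub is_visible: bool }

impl Position for PositionV0 {
    fn x(&self) -> f32 { self.x }
    fn y(&self) -> f32 { self.y }
    fn z(&self) -> f32 { self.z }
    fn version_number(&self) -> usize { 0 }
}

impl Position for PositionV1 {
    fn x(&self) -> f32 { self.x }
    fn y(&self) -> f32 { self.y }
    fn z(&self) -> f32 { self.z }
    fn version_number(&self) -> usize { 1 }
}

// Reads one little-endian f32 at `offset`, or None if the input is too short.
fn read_f32(raw: &[u8], offset: usize) -> Option<f32> {
    let b = raw.get(offset..offset + 4)?;
    Some(f32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

// Chooses the concrete struct from the runtime version of the data.
pub fn decode(version: usize, raw: &[u8]) -> Option<Box<dyn Position>> {
    let (x, y, z) = (read_f32(raw, 0)?, read_f32(raw, 4)?, read_f32(raw, 8)?);
    match version {
        0 => Some(Box::new(PositionV0 { x, y, z })),
        1 => {
            // Assumed layout: a single byte after the three floats.
            let is_visible = *raw.get(12)? != 0;
            Some(Box::new(PositionV1 { x, y, z, is_visible }))
        }
        _ => None,
    }
}
```

Callers then work with `Box<dyn Position>` and only look at `version_number()` when they need version-specific behaviour.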
Carson's answer is great when you do not have a lot of versions but for me I am working with data structures that range over 20 different versions.
I went with a solution that while I don't think is the most idiomatic, is capable of handling an arbitrary number of versions.
In short:
we implement a Version trait which gives the necessary version info to the Deserializer
Deserializer has VersionedSeqAccess (implements serde::de::SeqAccess) that sets a flag
When flag is set, we put None for that field and immediately unset the flag
The idea is to implement the following trait for the struct:
pub trait Version {
/// We must specify the name of the struct so that any of the fields that
/// are structs won't confuse the Deserializer
fn name() -> &'static str;
fn version() -> VersionInfo;
}
#[derive(Debug, Clone)]
pub enum VersionInfo {
/// Present in all versions
All,
/// Present in this version
Version([u16; 4]),
/// Represent Versions of structs
Struct(&'static [VersionInfo]),
// we can add other ways of expressing the version like a version range for ex.
}
Here is how it will be implemented for the example struct Position. This type of manual deriving is error prone so this can be improved with a derive macro (see end):
struct Position {
x: f32,
z: f32,
y: f32,
is_visible: Option<bool>, // With this solution versioned field must be wrapped in Option
}
impl Version for Position {
fn version() -> VersionInfo {
VersionInfo::Struct(&[
VersionInfo::All,
VersionInfo::All,
VersionInfo::All,
VersionInfo::Version([1, 13, 0, 0]),
])
}
fn name() -> &'static str {
"Position"
}
}
Now, the deserializer will be instantiated with the version of the data format we are currently parsing:
pub struct Deserializer<'de> {
input: &'de [u8],
/// The version the `Deserializer` expects the data format to be
de_version: [u16; 4],
/// Versions of each field (only used when deserializing to a struct)
version_info: VersionInfo,
/// Whether to skip deserializing the current item. This flag is set by `VersionedSeqAccess`.
/// When set, the current item is deserialized to `None`
skip: bool,
/// Name of the struct we are deserializing into. We use this to make sure we call the correct
/// visitor for children of this struct that are also structs
name: &'static str,
}
pub fn from_slice<'a, T>(input: &'a [u8], de_version: [u16; 4]) -> Result<T, Error>
where
T: Deserialize<'a> + Version,
{
let mut deserializer = Deserializer::from_slice(input, de_version, T::version(), T::name());
let t = T::deserialize(&mut deserializer)?;
Ok(t)
}
Now that the deserializer has all the information it needs, this is how we define deserialize_struct:
fn deserialize_struct<V>(
self, name: &'static str, fields: &'static [&'static str], visitor: V,
) -> Result<V::Value, Self::Error>
where
V: Visitor<'de>,
{
if name == self.name {
if let VersionInfo::Struct(version_info) = self.version_info {
// Make sure the caller implemented version info somewhat correctly.
// I use a derive macro to implement Version, so this is not a problem.
assert!(version_info.len() == fields.len());
visitor.visit_seq(VersionedSeqAccess::new(self, fields.len(), &version_info))
} else {
panic!("Struct must always have version info of `Struct` variant")
}
} else {
// This is for children structs of the main struct. We do not support versioning for those
visitor.visit_seq(SequenceAccess::new(self, fields.len()))
}
}
Here is how serde::de::SeqAccess will be implemented for VersionedSeqAccess:
struct VersionedSeqAccess<'a, 'de: 'a> {
de: &'a mut Deserializer<'de>,
version_info: &'static [VersionInfo],
len: usize,
curr: usize,
}
impl<'de, 'a> SeqAccess<'de> for VersionedSeqAccess<'a, 'de> {
type Error = Error;
fn next_element_seed<T>(&mut self, seed: T) -> Result<Option<T::Value>, Error>
where
T: DeserializeSeed<'de>,
{
if self.curr == self.len {
// We iterated through all fields
Ok(None)
} else {
// Get version of the current field
let version = &self.version_info[self.curr as usize];
self.de.version_info = version.clone();
// Set the flag if the version does not match
if !is_correct_version(&self.de.de_version, &version) {
self.de.skip = true;
}
self.curr += 1;
seed.deserialize(&mut *self.de).map(Some)
}
}
}
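The `is_correct_version` helper used above is not shown in this answer. A minimal sketch of what it might look like, assuming a field tagged `Version(v)` is present in every data version greater than or equal to `v` (that rule is an assumption; the real helper may differ):

```rust
#[derive(Debug, Clone)]
pub enum VersionInfo {
    All,
    Version([u16; 4]),
    Struct(&'static [VersionInfo]),
}

// Hypothetical implementation: returns true when the field should be read
// from data written in `de_version`.
pub fn is_correct_version(de_version: &[u16; 4], field: &VersionInfo) -> bool {
    match field {
        // Present in every version.
        VersionInfo::All => true,
        // Arrays compare lexicographically, so [1, 13, 0, 0] >= [1, 12, 9, 9].
        VersionInfo::Version(since) => de_version >= since,
        // Nested struct info is not versioned in this sketch.
        VersionInfo::Struct(_) => true,
    }
}
```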
The final part of the puzzle is inside deserialize_option. If we are at a field not found in current data format the skip flag will be set here and we will produce None:
fn deserialize_option<V>(self, visitor: V) -> Result<V::Value, Self::Error>
where
V: Visitor<'de>,
{
if self.skip {
self.skip = false;
visitor.visit_none()
} else {
visitor.visit_some(self)
}
}
A lengthy solution, but it works great for my use case, which involves many structs with many fields from different versions. Please do let me know how I can make this less verbose or otherwise better. I also implemented a derive macro (not shown here) for the Version trait to be able to do this:
#[derive(Debug, Clone, EventPrinter, Version)]
pub struct Position {
x: f32,
z: f32,
y: f32,
#[version([1, 13, 0, 0])]
is_visible: Option<bool>,
}
With this derive macro, I find that this solution tends to scale well for my use case.
Point and Vec2 are defined with the same fields and exactly the same constructor function:
pub struct Point {
pub x: f32,
pub y: f32,
}
pub struct Vec2 {
pub x: f32,
pub y: f32,
}
impl Point {
pub fn new(x: f32, y: f32) -> Self {
Self { x, y }
}
}
impl Vec2 {
pub fn new(x: f32, y: f32) -> Self {
Self { x, y }
}
}
Is it possible to define a trait to implement the constructor function?
So far I found it only possible to define the interface as the internal variables are not known:
pub trait TwoDimensional {
fn new(x: f32, y: f32) -> Self;
}
You can certainly define such a trait, and implement it for your 2 structs, but you will have to write the implementation twice. Even though traits can provide default implementations for functions, the following won't work:
trait TwoDimensional {
fn new(x: f32, y: f32) -> Self {
Self {
x,
y,
}
}
}
The reason why is fairly simple. What happens if you implement this trait for i32 or () or an enum?
Traits fundamentally don't have information about the underlying data structure that implements them. Rust does not support inheritance-style OOP, and trying to force it often leads to ugly, unidiomatic and less performant code.
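For comparison, the trait from the question does work once each type supplies its own body. In this sketch the generic `origin` function is a hypothetical addition that shows what the trait buys you:

```rust
pub trait TwoDimensional {
    fn new(x: f32, y: f32) -> Self;
}

#[derive(Debug, PartialEq)]
pub struct Point {
    pub x: f32,
    pub y: f32,
}

#[derive(Debug, PartialEq)]
pub struct Vec2 {
    pub x: f32,
    pub y: f32,
}

// The body has to be written once per type, because only the concrete
// type knows that it has fields named x and y.
impl TwoDimensional for Point {
    fn new(x: f32, y: f32) -> Self {
        Self { x, y }
    }
}

impl TwoDimensional for Vec2 {
    fn new(x: f32, y: f32) -> Self {
        Self { x, y }
    }
}

// Generic code can now construct either type through the trait.
pub fn origin<T: TwoDimensional>() -> T {
    T::new(0.0, 0.0)
}
```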
If however, you have a bunch of structs and want to essentially "write the same impl multiple times without copy/pasting", a macro might be useful. This pattern is common in the standard library, where, for example, there are certain functions that are implemented for all integer types. For example:
macro_rules! impl_constructor {
($name:ty) => {
impl $name {
pub fn new(x: f32, y: f32) -> Self {
Self {
x, y
}
}
}
}
}
impl_constructor!(Point);
impl_constructor!(Vec2);
These macros expand at compile time, so if you do something invalid (e.g. impl_constructor!(i32)), you'll get a compilation error, since inside the expanded impl block Self { x, y } would mean i32 { x, y }.
Personally I only use a macro when there is really a large number of types that need an implementation. This is just personal preference however, there is no runtime difference between a hand-written and a macro-generated impl block.
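When the list of types does grow, a repetition lets a single invocation cover all of them. This is a sketch of that variation on the macro above, not something the standard library exposes:

```rust
pub struct Point {
    pub x: f32,
    pub y: f32,
}

pub struct Vec2 {
    pub x: f32,
    pub y: f32,
}

macro_rules! impl_constructor {
    // One or more comma-separated types, with an optional trailing comma.
    ($($name:ty),+ $(,)?) => {
        $(
            impl $name {
                pub fn new(x: f32, y: f32) -> Self {
                    Self { x, y }
                }
            }
        )+
    };
}

// Expands to one inherent impl block per listed type.
impl_constructor!(Point, Vec2);
```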
Is there a memory efficient way to change the behavior on an inherent implementation? At the moment, I can accomplish the change of behavior by storing a number of function pointers, which are then called by the inherent implementation. My difficulty is that there could potentially be a large number of such functions and a large number of objects that depend on these functions, so I'd like to reduce the amount of memory used. As an example, consider the code:
// Holds the data for some process
struct MyData {
x: f64,
y: f64,
fns: MyFns,
}
impl MyData {
// Create a new object
fn new(x: f64, y: f64) -> MyData {
MyData {
x,
y,
fns: CONFIG1,
}
}
// One of our functions
fn foo(&self) -> f64 {
(self.fns.f)(self.x, self.y)
}
// Other function
fn bar(&self) -> f64 {
(self.fns.g)(self.x, self.y)
}
}
// Holds the functions
struct MyFns {
f: fn(x: f64, y: f64) -> f64,
g: fn(x: f64, y: f64) -> f64,
}
// Some functions to use
fn add(x: f64, y: f64) -> f64 {
x + y
}
fn sub(x: f64, y: f64) -> f64 {
x - y
}
fn mul(x: f64, y: f64) -> f64 {
x * y
}
fn div(x: f64, y: f64) -> f64 {
x / y
}
// Create some configurations
const CONFIG1: MyFns = MyFns {
f: add,
g: mul,
};
const CONFIG2: MyFns = MyFns {
f: sub,
g: div,
};
fn main() {
// Create our structure
let mut data = MyData::new(1., 2.);
// Check our functions
println!(
"1: x={}, y={}, foo={}, bar={}",
data.x,
data.y,
data.foo(),
data.bar()
);
// Change the functions
data.fns = CONFIG2;
// Print the functions again
println!(
"2: x={}, y={}, foo={}, bar={}",
data.x,
data.y,
data.foo(),
data.bar()
);
// Change a single function
data.fns.f = add;
// Print the functions again
println!(
"3: x={}, y={}, foo={}, bar={}",
data.x,
data.y,
data.foo(),
data.bar()
);
}
This code allows the behavior of foo and bar to be changed by editing f and g. However, it is also not flexible. I'd rather use a boxed trait object, Box<dyn Fn(f64, f64) -> f64>, but then I can't create default configurations like CONFIG1 and CONFIG2, because Box cannot be used to create a constant object. In addition, if we have a large number of functions and objects, I'd like to share the memory for their implementation. For function pointers this isn't a big deal, but for closures it is. Here, we can't create a constant Rc for the configuration to share the memory. Finally, we could have a static reference to a configuration, which would save memory, but then we could not change the individual functions. I'd rather have a situation where most of the time we share memory for the functions, but each object has the ability to hold its own memory and change the functions if desired.
I'm open to a better design if one is available. Ultimately, I'd like to change the behavior of foo and bar at runtime based on a function held, in some form or another, inside of MyData. Further, I'd like a way to do so where the memory is shared when possible and we have the ability to change an individual function and not just the entire configuration.
A plain dyn reference will work here - it allows references to objects that have a certain trait but with type known only at runtime.
(This is exactly what you want for function pointers. Think of it as each function having its own special type, but falling under a trait like Fn(f64,f64)->f64.)
So your struct could be defined as:
struct MyData<'a> {
x: f64,
y: f64,
f: &'a dyn Fn(f64, f64) -> f64,
g: &'a dyn Fn(f64, f64) -> f64,
}
(Note that you need the lifetime specifier 'a to ensure that the lifetime of those references is not shorter than that of the struct itself.)
Then your impl could be like:
impl<'a> MyData<'a> {
// Create a new object
fn new(x: f64, y: f64) -> Self {
MyData {
x,
y,
f: &add, // f and g as in CONFIG1
g: &mul,
}
}
fn foo(&self) -> f64 {
(self.f)(self.x, self.y)
}
// etc...
}
Depending on how you want the default configurations to work, you could either make them as more inherent functions such as fn to_config2(&mut self); or you could make a separate struct just with the function pointers and then have a function to copy those function pointers into the MyData struct.
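A sketch of the second option, assuming a hypothetical `MyFns` holder plus `apply` and `set_f` methods (none of these names come from the answer above). The presets are written as functions returning `MyFns<'static>`, which avoids relying on constants of trait-object type:

```rust
fn add(x: f64, y: f64) -> f64 { x + y }
fn sub(x: f64, y: f64) -> f64 { x - y }
fn mul(x: f64, y: f64) -> f64 { x * y }
fn div(x: f64, y: f64) -> f64 { x / y }

// A bundle of borrowed functions; copying it copies only the references.
#[derive(Clone, Copy)]
pub struct MyFns<'a> {
    pub f: &'a dyn Fn(f64, f64) -> f64,
    pub g: &'a dyn Fn(f64, f64) -> f64,
}

pub fn config1() -> MyFns<'static> {
    MyFns { f: &add, g: &mul }
}

pub fn config2() -> MyFns<'static> {
    MyFns { f: &sub, g: &div }
}

pub struct MyData<'a> {
    pub x: f64,
    pub y: f64,
    f: &'a dyn Fn(f64, f64) -> f64,
    g: &'a dyn Fn(f64, f64) -> f64,
}

impl<'a> MyData<'a> {
    pub fn new(x: f64, y: f64) -> Self {
        let MyFns { f, g } = config1();
        MyData { x, y, f, g }
    }

    // Replace the whole configuration at once.
    pub fn apply(&mut self, fns: MyFns<'a>) {
        self.f = fns.f;
        self.g = fns.g;
    }

    // Replace a single function; closures work as long as they outlive self.
    pub fn set_f(&mut self, f: &'a dyn Fn(f64, f64) -> f64) {
        self.f = f;
    }

    pub fn foo(&self) -> f64 { (self.f)(self.x, self.y) }
    pub fn bar(&self) -> f64 { (self.g)(self.x, self.y) }
}
```

Every `MyData` stores only two references, so any number of objects can point at the same closure, and each object can still swap one function without touching the others.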
To create a default struct, I used to see fn new() -> Self in Rust, but today, I discovered Default. So there are two ways to create a default struct:
struct Point {
x: i32,
y: i32,
}
impl Point {
fn new() -> Self {
Point {
x: 0,
y: 0,
}
}
}
impl Default for Point {
fn default() -> Self {
Point {
x: 0,
y: 0,
}
}
}
fn main() {
let _p1 = Point::new();
let _p2: Point = Default::default();
}
What is the better / the most idiomatic way to do so?
If you had to pick one, implementing the Default trait is the better choice to allow your type to be used generically in more places while the new method is probably what a human trying to use your code directly would look for.
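A few of the generic places where a Default implementation gets picked up, as a sketch; the helper function names are made up for illustration:

```rust
#[derive(Debug, PartialEq)]
pub struct Point {
    pub x: i32,
    pub y: i32,
}

impl Default for Point {
    fn default() -> Self {
        Point { x: 0, y: 0 }
    }
}

// Option::unwrap_or_default requires T: Default.
pub fn or_origin(p: Option<Point>) -> Point {
    p.unwrap_or_default()
}

// std::mem::take swaps in T::default() and returns the old value.
pub fn take_and_reset(p: &mut Point) -> Point {
    std::mem::take(p)
}

// Struct update syntax fills the remaining fields from the default value.
pub fn on_x_axis(x: i32) -> Point {
    Point { x, ..Default::default() }
}
```

None of these would work with only an inherent `new()`, since generic code has no way to know that the method exists.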
However, your question is a false dichotomy: you can do both, and I encourage you to do so! Of course, repeating yourself is silly, so I'd call one from the other (it doesn't really matter which way):
impl Point {
fn new() -> Self {
Default::default()
}
}
Clippy even has a lint for this exact case: new_without_default.
I use Default::default() in structs that have member data structures where I might change out the implementation. For example, I might be currently using a HashMap but want to switch to a BTreeMap. Using Default::default gives me one less place to change.
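A sketch of that idea, using a hypothetical `Inventory` type: the constructor never names the map type, so switching the field to a `BTreeMap` would only touch the import and the field declaration.

```rust
use std::collections::HashMap;

pub struct Inventory {
    // Changing this to BTreeMap<String, u32> needs no edit in `new`.
    counts: HashMap<String, u32>,
}

impl Inventory {
    pub fn new() -> Self {
        Inventory {
            // The concrete map type is inferred from the field declaration.
            counts: Default::default(),
        }
    }

    // Increments the count for `item` and returns the new total.
    pub fn add(&mut self, item: &str) -> u32 {
        let n = self.counts.entry(item.to_string()).or_insert(0);
        *n += 1;
        *n
    }
}
```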
In this particular case, you can even derive Default, making it very succinct:
#[derive(Default)]
struct Point {
x: i32,
y: i32,
}
impl Point {
fn new() -> Self {
Default::default()
}
}
fn main() {
let _p1 = Point::new();
let _p2: Point = Default::default();
}