How to sort a vector containing structs? - rust

Let's say I have some code like:
struct GenericStruct {
a: u8,
b: String,
}
fn sort_array(generic_vector: Vec<GenericStruct>) -> Vec<GenericStruct> {
// Some code here to sort a vector.
todo!();
}
fn main() {
let some_words = String::from("Hello Word");
let x = GenericStruct { a: 25, b: some_words };
let some_vector: Vec<GenericStruct> = vec![x];
}
How could I sort vectors based on one part of it, such as sorting by "a", or sorting by the length of "b"?

Two possibilities.
Either implement the Ord trait for your struct, or use the sort_unstable_by_key method.
You'd use the former if, for your GenericStruct, there is a single obvious way to order values that makes sense not just in your current sorting use case but generally.
You'd use the latter if this sorting scheme is more of a "one-off".
// some_vector must be declared with `let mut` for this to compile
some_vector.sort_unstable_by_key(|element| element.a);
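Both options can be sketched together. This is a minimal illustration, not the only reasonable ordering: the tie-break on `b` in the `Ord` impl and the helper names `sort_by_a`, `sort_by_b_len` and `sample` are assumptions made for the example.

```rust
use std::cmp::Ordering;

#[derive(Debug, PartialEq, Eq)]
struct GenericStruct {
    a: u8,
    b: String,
}

// Option 1: a general-purpose ordering. It compares `a` first and falls back
// to `b` (an assumed tie-break) so that it stays consistent with the derived `Eq`.
impl Ord for GenericStruct {
    fn cmp(&self, other: &Self) -> Ordering {
        self.a.cmp(&other.a).then_with(|| self.b.cmp(&other.b))
    }
}

impl PartialOrd for GenericStruct {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

// Option 2: one-off sorts by a key, leaving the type's own ordering alone.
fn sort_by_a(v: &mut [GenericStruct]) {
    v.sort_unstable_by_key(|element| element.a);
}

fn sort_by_b_len(v: &mut [GenericStruct]) {
    // `sort_by_key` is the stable variant: equal keys keep their relative order.
    v.sort_by_key(|element| element.b.len());
}

// Hypothetical sample data for trying the sorts out.
fn sample() -> Vec<GenericStruct> {
    vec![
        GenericStruct { a: 25, b: String::from("Hello World") },
        GenericStruct { a: 3, b: String::from("Hi") },
        GenericStruct { a: 10, b: String::from("Hey") },
    ]
}
```

With the `Ord` impl in place, a plain `v.sort()` or `v.sort_unstable()` also works on a `Vec<GenericStruct>`.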

Related

How to create a proc macro that can read a const generic?

I wanted a way to create an array with non-copy values. So I came up with the following:
use proc_macro::TokenStream;
use quote::quote;
use syn::parse::{Parse, ParseStream, Result};
use syn::{parse_macro_input, Expr, LitInt, Token};
struct ArrayLit(Expr, LitInt);
impl Parse for ArrayLit {
fn parse(input: ParseStream) -> Result<Self> {
let v: Expr = input.parse()?;
let _ = input.parse::<Token![;]>()?;
let n: LitInt = input.parse()?;
Ok(ArrayLit(v, n))
}
}
#[proc_macro]
pub fn arr(input: TokenStream) -> TokenStream {
let arr = parse_macro_input!(input as ArrayLit);
let items = std::iter::repeat(arr.0).take(
arr.1.base10_parse::<usize>().expect("error parsing array length"),
);
(quote! { [#(#items),*] }).into()
}
This works with numeric literal sizes, like:
fn f() -> [Option<u32>; 10] {
let mut it = 0..5;
arr![it.next(); 10]
}
How do I change this proc-macro code so that it will take a const generic? For example, I'd like it to work with the following function:
fn f<const N: usize>() -> [Option<u32>; N] {
let mut it = 0..5;
arr![it.next(); N]
}
This is not possible as written. Your macro works by repeating a fragment of code, which necessarily happens at macro-expansion time, before the program is type-checked. Generics, including const generics, are monomorphized (converted into code with, in this case, a specific value for N) well after the program has been parsed, type-checked, and so on.
You will have to use a different strategy. For example, you could have your macro generate a loop which loops N times.
By the way, if you're not already aware of it, check out std::array::from_fn() which allows constructing an array from repeatedly calling a function (so, more or less the same effect as your macro).
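As a rough sketch of the std::array::from_fn() route, the function from the question can be written without any macro. The closure is called once per index, walking forward through the array, so a stateful iterator can be captured and advanced in it.

```rust
use std::array;

// `N` is only known after monomorphization, so the array is built by calling
// a closure once per index instead of repeating tokens in a macro.
fn f<const N: usize>() -> [Option<u32>; N] {
    let mut it = 0..5;
    // The index argument is ignored here; each call just advances `it`.
    array::from_fn(|_index| it.next())
}
```

For example, `f::<7>()` yields five `Some` values followed by two `None`s, because the range is exhausted after five calls.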

Look up a struct from a user input string

I am writing a program which will receive user input from CSV or JSON (that doesn't really matter). There are potentially many inputs (each line of a CSV for example), which would reference different structs. So, I need to return an instance of a struct for each input string, but I don't know upfront which struct that would be. My attempt (code doesn't compile):
fn main () {
let zoo: Vec<Box<dyn Animal>>;
let user_input = "Cat,Persik";
let user_input = user_input.split(",");
match user_input.nth(0) {
"Cat" => zoo.push(Cat(user_input.nth(0))),
_ => zoo.push(Dog(user_input.nth(0))) //here user would be expected to provide a u8
}
}
trait Animal {}
struct Dog {
age: u8,
}
impl Animal for Dog {}
struct Cat {
name: String,
}
impl Animal for Cat {}
One way to do it is with if statements like this. But if there are hundreds of animals that would make the code pretty ugly. I have a macro which returns struct name for an instance, but I couldn't figure out a way to use that. I also thought about using enum for this, but couldn't figure out either.
Is there a shorter and more concise way of doing this?
Doesn't this way limit me in using only methods defined in the Animal trait on items of zoo? If so, is there a way around this constraint?
Essentially, I want to get a vector of structs, and to be able to use their methods freely. I don't know how many there will be, and I don't know in advance which structs exactly.
It's often helpful to use helper functions for parsing things. We can implement this function on the trait itself to keep the parsing function associated with the trait.
We'll have the function return Result<Box<dyn animal>, ()> since it's possible for parsing to fail. (We'd probably want a proper error type instead of () in real code.)
trait animal{}
impl dyn animal {
fn try_parse(kind: &str, data: &str) -> Result<Box<dyn animal>, ()> {
match kind {
"Cat" => Ok(Box::new(Cat { name: data.into() })),
"Dog" => Ok(Box::new(Dog { age: data.parse().map_err(|_e| ())? })),
_ => Err(()),
}
}
}
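To illustrate the remark about a proper error type, here is one possible shape. `ParseAnimalError` and its variants are hypothetical names invented for this sketch, and the trait is spelled `Animal` as in the question.

```rust
use std::fmt;
use std::num::ParseIntError;

// Hypothetical error type (not from the original answer).
#[derive(Debug, PartialEq)]
enum ParseAnimalError {
    UnknownKind(String),
    InvalidAge(ParseIntError),
}

impl fmt::Display for ParseAnimalError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::UnknownKind(kind) => write!(f, "unknown animal kind: {}", kind),
            Self::InvalidAge(err) => write!(f, "invalid dog age: {}", err),
        }
    }
}

impl std::error::Error for ParseAnimalError {}

trait Animal {}

struct Dog {
    age: u8,
}
impl Animal for Dog {}

struct Cat {
    name: String,
}
impl Animal for Cat {}

impl dyn Animal {
    fn try_parse(kind: &str, data: &str) -> Result<Box<dyn Animal>, ParseAnimalError> {
        match kind {
            "Cat" => Ok(Box::new(Cat { name: data.into() })),
            "Dog" => {
                // Keep the underlying parse error so callers can report it.
                let age = data.parse::<u8>().map_err(ParseAnimalError::InvalidAge)?;
                Ok(Box::new(Dog { age }))
            }
            other => Err(ParseAnimalError::UnknownKind(other.to_string())),
        }
    }
}
```

Callers can then tell an unknown kind apart from a malformed age instead of receiving an opaque `()`.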
Ok, so now we have a function that can be used to parse a single animal, and has a way to signal failure. We could now parse a comma-separated string building off of this function, again signaling errors if the string doesn't contain a comma:
impl dyn animal {
// try_parse()
fn try_parse_comma_separated(input: &str) -> Result<Box<dyn animal>, ()> {
let mut split = input.split(','); // must be mutable: next() advances the iterator
let parts = (split.next(), split.next());
// parts is a tuple of two Option<&str>. We can only proceed if both
// are Some.
match parts {
(Some(kind), Some(data)) => Self::try_parse(kind, data),
_ => Err(()),
}
}
}
Now our main() is trivial:
fn main() {
let mut zoo: Vec<Box<dyn animal>> = vec![];
let user_input = "Cat,Persik";
zoo.push(<dyn animal>::try_parse_comma_separated(user_input).unwrap());
}
Separating things out like this allows us to reuse these functions in other interesting ways. Let's say you wanted to parse a string like "Cat,Persik,Dog,5" as two values. That can now be done by using iterators and mapping over our parse function:
fn main() {
let user_input = "Cat,Persik,Dog,5";
let zoo = user_input.split(',').collect::<Vec<_>>()
.chunks_exact(2) // Group the input into slices of 2 elements each
.map(|s| <dyn animal>::try_parse(s[0], s[1]).unwrap())
.collect::<Vec<_>>();
}
To answer your question about a better way to do this when managing many implementors of animal, you could move the implementation-specific parsing logic into a similar function on each implementation instead, and call that functionality from <dyn animal>::try_parse(). The parsing logic has to live somewhere.
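One way this delegation might look is sketched below. The per-type `parse` functions are hypothetical names for this example, and the trait is spelled `Animal` as in the question; the dispatcher keeps only the mapping from kind to type.

```rust
trait Animal {}

struct Dog {
    age: u8,
}
impl Animal for Dog {}

struct Cat {
    name: String,
}
impl Animal for Cat {}

// Hypothetical per-type constructors: each type owns its own parsing rules.
impl Dog {
    fn parse(data: &str) -> Result<Self, ()> {
        let age = data.parse::<u8>().map_err(|_e| ())?;
        Ok(Dog { age })
    }
}

impl Cat {
    fn parse(data: &str) -> Result<Self, ()> {
        Ok(Cat { name: data.to_string() })
    }
}

impl dyn Animal {
    // The dispatcher only maps a kind to a type and boxes the result.
    fn try_parse(kind: &str, data: &str) -> Result<Box<dyn Animal>, ()> {
        match kind {
            "Cat" => Cat::parse(data).map(|cat| Box::new(cat) as Box<dyn Animal>),
            "Dog" => Dog::parse(data).map(|dog| Box::new(dog) as Box<dyn Animal>),
            _ => Err(()),
        }
    }
}
```

Adding a new animal then means writing its own `parse` and adding one arm to the match.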
Doesn't this way limit me in using only methods defined in animal Trait on items of zoo? If so, is there a way around this constraint?
Without downcasting, yes. Generally when you have a collection of polymorphic values like dyn animal, you want to use them polymorphically -- invoking only methods defined on the animal trait. Each implementation of the trait on a specific type can implement the trait's interface however it makes sense for that animal.
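A small sketch of what using the values polymorphically could look like. The `describe` method and the `describe_all` helper are hypothetical additions for illustration; the trait is spelled `Animal` as in the question.

```rust
trait Animal {
    // Hypothetical trait method: every animal decides how to describe itself.
    fn describe(&self) -> String;
}

struct Dog {
    age: u8,
}

impl Animal for Dog {
    fn describe(&self) -> String {
        format!("a dog aged {}", self.age)
    }
}

struct Cat {
    name: String,
}

impl Animal for Cat {
    fn describe(&self) -> String {
        format!("a cat named {}", self.name)
    }
}

// Works for any mix of animals without knowing the concrete types.
fn describe_all(zoo: &[Box<dyn Animal>]) -> Vec<String> {
    zoo.iter().map(|animal| animal.describe()).collect()
}
```

The caller never names `Dog` or `Cat`; type-specific behavior lives behind the trait method.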
Downcasting is non-trivial, but with a helper trait it becomes a bit more palatable:
use std::any::Any;
trait AsAny {
fn as_any(&self) -> &dyn Any;
}
impl<T: 'static + animal> AsAny for T {
fn as_any(&self) -> &dyn Any { self }
}
trait animal: AsAny { }
Now, given a boxed value pet: Box<dyn animal>, you can use pet.as_any().downcast_ref::<Dog>(), for example, which gives you back an Option<&Dog>. This will be None if the boxed animal isn't a dog. Based on the zoo in the last example (with a dog and a cat):
let dogs = zoo.iter()
// Filter down the zoo to just dogs (produces a sequence of &Dog)
.filter_map(|animal| animal.as_any().downcast_ref::<Dog>());
// We should only find one dog in the zoo.
assert_eq!(dogs.count(), 1);
But this should be an absolute last resort when using your animals polymorphically isn't an option.
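Put together, the downcasting pieces above might look like the following self-contained sketch. The `dog_ages` helper is a hypothetical name, and the trait is spelled `Animal` as in the question.

```rust
use std::any::Any;

trait AsAny {
    fn as_any(&self) -> &dyn Any;
}

// The blanket impl is bounded on `Animal` so that it covers `Dog` and `Cat`
// but not `Box<dyn Animal>` itself; calling `as_any()` on a box therefore
// reaches the value inside it.
impl<T: 'static + Animal> AsAny for T {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

trait Animal: AsAny {}

struct Dog {
    age: u8,
}
impl Animal for Dog {}

struct Cat {
    name: String,
}
impl Animal for Cat {}

// Collects the ages of all dogs, skipping every other kind of animal.
fn dog_ages(zoo: &[Box<dyn Animal>]) -> Vec<u8> {
    zoo.iter()
        .filter_map(|animal| animal.as_any().downcast_ref::<Dog>())
        .map(|dog| dog.age)
        .collect()
}
```

Here `downcast_ref::<Dog>()` returns `None` for the cat, so `filter_map` drops it.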

How to use fold without making the list mutable?

I have provided an example here:
#[derive(Default)]
struct Example {
name: String,
description: String,
}
fn main() {
let lines = vec![
"N: N1",
"D: D1",
"",
"N: N2",
"D: D2",
"",
];
let data = lines.into_iter().fold(Vec::<Example>::new(), |acc, line| {
let mut examples = acc;
match line.chars().collect::<Vec<char>>().as_slice() {
['N', ..] => {
let mut element:Example = Default::default();
element.name = line[2..].into();
examples.push(element);
}
['D', ..] => {
let mut element = examples.pop().unwrap();
element.description = line[2..].into();
examples.push(element);
}
&[_, ..] => {}
&[] => {}
}
return examples;
});
for example in data{
println!("Name: {}, Description: {}", example.name, example.description);
}
}
Basically, I will be processing a stream of lines (the number is unknown until runtime; I have used an array here for the purpose of the example) and I want to build up a struct with the information. When I reach a given termination point, I start a new struct and add it to the list.
My first attempts at this used an outermost mutable list. I then discovered the fold method, which seemed more elegant (IMO), but I still have to make the list mutable inside.
What would be a better way of achieving this and/or how could I remove the need to make the list mutable?
If you always have this same structure (only 2 fields, 3 rows per record), and you can have 2 independent iterators over the data, it is possible to do a trick:
let names = lines.iter();
let descriptions = lines.iter().skip(1);
let name_desc_pairs = names.zip(descriptions).step_by(3);
let examples = name_desc_pairs.map(parse_example);
Where fn parse_example(lines: (&String, &String)) -> Example would take a pair of (name_line, description_line) strings and construct an Example.
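A possible `parse_example` for this trick is sketched below. It assumes the lines are `&str` values (as in the question's `vec!`, so the iterator yields `&&str` rather than `&String`) and that each line starts with `N: ` or `D: `; `parse_all` is a hypothetical wrapper name.

```rust
#[derive(Debug, Default, PartialEq)]
struct Example {
    name: String,
    description: String,
}

// Takes a (name_line, description_line) pair and strips the assumed prefixes.
// A line without the expected prefix is kept as-is.
fn parse_example((name_line, description_line): (&&str, &&str)) -> Example {
    Example {
        name: name_line.strip_prefix("N: ").unwrap_or(*name_line).to_string(),
        description: description_line
            .strip_prefix("D: ")
            .unwrap_or(*description_line)
            .to_string(),
    }
}

fn parse_all(lines: &[&str]) -> Vec<Example> {
    let names = lines.iter();
    let descriptions = lines.iter().skip(1);
    // Zipping pairs every line with its successor; keeping every third pair
    // leaves exactly the (name, description) pairs.
    names.zip(descriptions).step_by(3).map(parse_example).collect()
}
```

No binding is declared `mut` here; the iterator adaptors carry all the state.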
Otherwise, if you want an arbitrary number of fields, consider that while you iterate over lines you only have a partial example at first, so some buffering of the partial state is needed. There are no methods for that in the standard Iterator trait.
There's a chunks method in the futures crate if you can use that: stream::iter(lines).chunks(3) yields vectors of 3 lines, each of which you can parse into an Example.
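If the lines are already in memory as a slice or Vec, a similar grouping is available without the futures crate through the standard slice method `chunks`. This swaps in a different technique from the one named above and only applies when the data is not a lazy stream; `parse_chunks` is a hypothetical name, and the `N: ` / `D: ` prefixes are assumed from the question.

```rust
// Groups the lines three at a time and parses each group into a
// (name, description) pair; malformed groups are skipped.
fn parse_chunks(lines: &[&str]) -> Vec<(String, String)> {
    lines
        .chunks(3)
        .filter_map(|chunk| {
            let name = chunk.first()?.strip_prefix("N: ")?;
            let description = chunk.get(1)?.strip_prefix("D: ")?;
            Some((name.to_string(), description.to_string()))
        })
        .collect()
}
```

The last chunk may be shorter than three lines; `first()` and `get(1)` handle that case by returning `None`.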
Without that it's possible to implement your own buffering & parsing Iterator.
The idea is that the iterator state contains a partial example, e.g.:
struct ExampleBuilder {
name: Option<String>,
description: Option<String>,
}
and wraps the original iterator. In its next() method it calls next() on the original iterator a few times, and either adds line data to the ExampleBuilder or, when it reaches the "" separator, converts the ExampleBuilder to an Example and returns it.
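The buffering iterator described above could be sketched as follows. The wrapper name `Examples`, the `build` method, and the policy of defaulting a missing field to an empty string are assumptions made for this example, as are the `N: ` / `D: ` prefixes taken from the question.

```rust
#[derive(Debug, PartialEq)]
struct Example {
    name: String,
    description: String,
}

#[derive(Default)]
struct ExampleBuilder {
    name: Option<String>,
    description: Option<String>,
}

impl ExampleBuilder {
    // Returns None when nothing was collected; otherwise empties the builder
    // and produces an Example (missing fields default to empty strings).
    fn build(&mut self) -> Option<Example> {
        if self.name.is_none() && self.description.is_none() {
            return None;
        }
        Some(Example {
            name: self.name.take().unwrap_or_default(),
            description: self.description.take().unwrap_or_default(),
        })
    }
}

// Wraps any iterator of lines and yields one Example per record.
struct Examples<I> {
    lines: I,
    builder: ExampleBuilder,
}

impl<I> Examples<I> {
    fn new(lines: I) -> Self {
        Examples { lines, builder: ExampleBuilder::default() }
    }
}

impl<'a, I: Iterator<Item = &'a str>> Iterator for Examples<I> {
    type Item = Example;

    fn next(&mut self) -> Option<Example> {
        for line in self.lines.by_ref() {
            if let Some(name) = line.strip_prefix("N: ") {
                self.builder.name = Some(name.to_string());
            } else if let Some(description) = line.strip_prefix("D: ") {
                self.builder.description = Some(description.to_string());
            } else if line.is_empty() {
                // Separator: emit the record collected so far, if any.
                if let Some(example) = self.builder.build() {
                    return Some(example);
                }
            }
        }
        // Input ended: flush a trailing record that had no separator.
        self.builder.build()
    }
}
```

The mutable state is now confined to the iterator, so the calling code can stay as `Examples::new(lines.into_iter()).collect::<Vec<_>>()` with no `mut` binding.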

Rust macro that counts and generates repetitive struct fields

I want to write a macro that generates varying structs from an integer argument. For example, make_struct!(3) might generate something like this:
pub struct MyStruct3 {
field_0: u32,
field_1: u32,
field_2: u32
}
What's the best way to transform that "3" literal into a number that I can use to generate code? Should I be using macro_rules! or a proc-macro?
You need a procedural attribute macro and quite a bit of pipework. An example implementation is on GitHub; bear in mind that it is rough around the edges, but it works well enough as a starting point.
The aim is to have the following:
#[derivefields(u32, "field", 3)]
struct MyStruct {
foo: u32
}
expand to:
struct MyStruct {
pub field_0: u32,
pub field_1: u32,
pub field_2: u32,
foo: u32
}
To do this, first, we're going to establish a couple of things. We're going to need a struct to easily store and retrieve our arguments:
struct MacroInput {
pub field_type: syn::Type,
pub field_name: String,
pub field_count: u64
}
The rest is pipework:
use syn::parse::{Parse, ParseStream};
impl Parse for MacroInput {
fn parse(input: ParseStream) -> syn::Result<Self> {
let field_type = input.parse::<syn::Type>()?;
let _comma = input.parse::<syn::token::Comma>()?;
let field_name = input.parse::<syn::LitStr>()?;
let _comma = input.parse::<syn::token::Comma>()?;
let count = input.parse::<syn::LitInt>()?;
Ok(MacroInput {
field_type,
field_name: field_name.value(),
// Propagate a bad count as a compile error instead of panicking
field_count: count.base10_parse()?
})
}
}
This implements syn::parse::Parse for our struct, which allows us to use syn::parse_macro_input!() to parse our arguments.
use proc_macro::TokenStream;
use proc_macro2::{Ident, Span};
use quote::quote;
#[proc_macro_attribute]
pub fn derivefields(attr: TokenStream, item: TokenStream) -> TokenStream {
let input = syn::parse_macro_input!(attr as MacroInput);
let mut found_struct = false; // We actually need a struct
item.into_iter().map(|r| {
match &r {
&proc_macro::TokenTree::Ident(ref ident) if ident.to_string() == "struct" => { // react on keyword "struct" so we don't randomly modify non-structs
found_struct = true;
r
},
&proc_macro::TokenTree::Group(ref group) if group.delimiter() == proc_macro::Delimiter::Brace && found_struct => { // The brace-delimited body of the struct
let mut stream = proc_macro::TokenStream::new();
stream.extend((0..input.field_count).fold(vec![], |mut state:Vec<proc_macro::TokenStream>, i| {
let field_name_str = format!("{}_{}", input.field_name, i);
let field_name = Ident::new(&field_name_str, Span::call_site());
let field_type = input.field_type.clone();
state.push(quote!(pub #field_name: #field_type,
).into());
state
}).into_iter());
stream.extend(group.stream());
proc_macro::TokenTree::Group(
proc_macro::Group::new(
proc_macro::Delimiter::Brace,
stream
)
)
}
_ => r
}
}).collect()
}
The macro builds a new TokenStream and adds our generated fields first. This ordering is important: if the struct provided is struct Foo { bar: u8 }, appending our fields last would cause a parse error due to a missing comma after bar: u8. Prepending lets us ignore this, since a trailing comma in a struct is not a parse error.
Once we have this TokenStream, we successively extend() it with the generated tokens from quote::quote!(); this allows us to not have to build the token fragments ourselves. One gotcha is that the field name needs to be converted to an Ident (it gets quoted otherwise, which isn't something we want).
We then return this modified TokenStream as a TokenTree::Group to signify that this is indeed a block delimited by braces.
In doing so, we also solved a few problems:
Since structs without named members (pub struct Foo(u32), for example) never have an opening brace, this macro is a no-op for them.
It is a no-op for any item that isn't a struct.
It is also a no-op for structs without any members that are declared without braces (struct Foo;).

Moving array values between enum variations

My problem is following. I have enum with several variants that use increasing number of items. For simplicity I'll reduce the numbers to first two:
#[derive(Debug)]
pub enum Variant<A> {
Single([A; 1]),
Double([A; 2]),
}
I want to create special methods which would preferably transform Single into Double. For example, if I call push_front(a) on Single([x]) I need to get back Double([a, x]). One way I could do it is:
impl<A: Copy> Variant<A> {
fn push_front(&mut self, value: A) {
*self = match self {
&mut Single(b) => Double([value, b[0]]),
_ => panic!("Can't convert"),
};
}
}
Is there a way to achieve similar effect without A having to implement Copy?
Playground link: http://is.gd/i0bQtl
You could change the constraint from Copy to Clone; then, the match arm would become:
&mut Single(ref b) => Double([value, b[0].clone()]),
On nightly (at the time of writing; array patterns like this have since been stabilized) you can use the "slice_patterns" syntax:
Single([one]) => Double([value, one]),
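On current stable Rust, one way to avoid both Copy and Clone is to take `self` by value so that the array can be moved out by pattern. This sketch changes the signature from the question (`self` instead of `&mut self`, returning the new value), which is an assumption about what the caller can accept.

```rust
#[derive(Debug, PartialEq)]
pub enum Variant<A> {
    Single([A; 1]),
    Double([A; 2]),
}

impl<A> Variant<A> {
    // Consumes the old value, so the element can be moved rather than copied.
    fn push_front(self, value: A) -> Self {
        match self {
            Variant::Single([one]) => Variant::Double([value, one]),
            Variant::Double(_) => panic!("Can't convert"),
        }
    }
}
```

A caller holding a variable `v` would write `v = v.push_front(a);` instead of `v.push_front(a);`.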
