How to generalise access to struct fields? - rust

I am trying to find the differences between two streams (represented by iterators) for later analysis. The code below works fine, but it looks a bit ugly, and updating values in the update_v? functions is error-prone (copy-paste!). Is there any way to generalise it, given that it matters which source a value came from?
#[derive(PartialEq)] // needed for the == comparisons below
struct Data {}
struct S {
v1: Option<Data>,
v2: Option<Data>
}
...
fn update_v1(diffs: &mut HashMap<u64, S>, key: u64, data: Data) {
match diffs.entry(key) {
Entry::Vacant(v) => {
let variant = S {
v1: Some(data),
v2: None
};
v.insert(variant);
},
Entry::Occupied(e) => {
let new_variant = Some(data);
if e.get().v2 == new_variant {
e.remove();
} else {
let existing = e.into_mut();
existing.v1 = new_variant;
}
}
}
}
fn update_v2(diffs: &mut HashMap<u64, S>, key: u64, data: Data) {
match diffs.entry(key) {
Entry::Vacant(v) => {
let variant = S {
v2: Some(data),
v1: None
};
v.insert(variant);
},
Entry::Occupied(e) => {
let new_variant = Some(data);
if e.get().v1 == new_variant {
e.remove();
} else {
let existing = e.into_mut();
existing.v2 = new_variant;
}
}
}
}

Instead of writing one function per field, take a pair of Fns as arguments:
A getter, fn(&S) -> &Option<Data> (returning a reference, so nothing is moved out of the borrowed S), which can be used to replace this condition
if e.get().v1 == new_variant { /* ... */ }
with this
if getter(e.get()) == &new_variant { /* ... */ }
A setter, fn(&mut S, Option<Data>) -> (), which replaces
existing.v2 = new_variant;
with
setter(existing, new_variant);
Then at the call site you pass a couple of closures like this
Getter: |s| &s.v1
Setter: |s, d| s.v2 = d
Or vice versa for the other function.
And if you want to keep the update_v1 and update_v2 function names, just write those as wrappers to this new generalized function that automatically pass the proper lambdas.
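The getter/setter idea above can be sketched as follows. This is only one possible shape, under a few assumptions that are not in the original code: Data is given a u32 payload and derives PartialEq so it can be compared, S derives Default so the vacant case can start from an empty value, and the getter returns a reference so that Data does not have to be Copy.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

// Hypothetical payload; the question leaves Data empty.
#[derive(Debug, PartialEq)]
struct Data(u32);

#[derive(Debug, Default, PartialEq)]
struct S {
    v1: Option<Data>,
    v2: Option<Data>,
}

/// Generic update: `getter` reads the *other* side, `setter` writes *this* side.
fn update(
    diffs: &mut HashMap<u64, S>,
    key: u64,
    data: Data,
    getter: impl Fn(&S) -> &Option<Data>,
    setter: impl Fn(&mut S, Option<Data>),
) {
    match diffs.entry(key) {
        Entry::Vacant(v) => {
            // Nothing recorded yet: store the value on this side only.
            let mut fresh = S::default();
            setter(&mut fresh, Some(data));
            v.insert(fresh);
        }
        Entry::Occupied(mut e) => {
            let new_value = Some(data);
            if *getter(e.get()) == new_value {
                // Both sides agree, so this key is no longer a difference.
                e.remove();
            } else {
                setter(e.get_mut(), new_value);
            }
        }
    }
}

// Thin wrappers that keep the original function names.
fn update_v1(diffs: &mut HashMap<u64, S>, key: u64, data: Data) {
    update(diffs, key, data, |s| &s.v2, |s, d| s.v1 = d);
}

fn update_v2(diffs: &mut HashMap<u64, S>, key: u64, data: Data) {
    update(diffs, key, data, |s| &s.v1, |s, d| s.v2 = d);
}
```

With this shape the field names appear only in the two wrappers, so a copy-paste slip is confined to a single line per wrapper.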

You can create a trait to facilitate different ways of accessing the structure.
trait SAccessor {
type RV;
// Parameter names are required in trait methods since the 2018 edition.
fn new(data: Data) -> S;
fn v2(s: &S) -> &Self::RV;
fn v1_mut(s: &mut S) -> &mut Self::RV;
}
struct DirectSAccessor;
impl SAccessor for DirectSAccessor {
type RV = Option<Data>;
fn new(data: Data) -> S {
S {
v1: Some(data),
v2: None
}
}
fn v2(s: &S) -> &Self::RV {
&s.v2
}
fn v1_mut(s: &mut S) -> &mut Self::RV {
&mut s.v1
}
}
fn update<A>(diffs: &mut HashMap<u64, S>, key: u64, data: Data)
where A: SAccessor<RV=Option<Data>>
{
match diffs.entry(key) {
Entry::Vacant(v) => {
let variant = A::new(data);
v.insert(variant);
},
Entry::Occupied(e) => {
let new_variant = Some(data);
if A::v2(e.get()) == &new_variant {
e.remove();
} else {
let existing = e.into_mut();
*A::v1_mut(existing) = new_variant;
}
}
}
}
// ...
// update::<DirectSAccessor>( ... );

Related

Want to pass a parameterized enum to a function using _ as parameter (like my_fun(MyEnum::Type(_)))

I have this next_expected_kind method that returns the next item of an Iterable<Kind> if it is of the expected kind, or an error if not.
It works fine for non parameterized types like Kind1, but I don't know how to use it if the type that needs parameters like Kind2.
Something like:
let _val = match s.next_expected_kind(Kind::Kind2(str)) {
Ok(k) => str,
_ => panic!("error"),
};
Is there any trick to make this work?
https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=d21d5cff42fcca633e95b4915ce2bf1d
#[derive(PartialEq, Eq)]
enum Kind {
Kind1,
Kind2(String),
}
struct S {
kinds: std::vec::IntoIter<Kind>,
}
impl S {
fn next_expected_kind(&mut self, expected: Kind) -> Result<Kind, &str> {
match self.kinds.next() {
Some(k) if k == expected => Ok(k),
_ => Err("not expected"),
}
}
}
fn main() {
let mut s = S {
kinds: vec![Kind::Kind1, Kind::Kind2(String::from("2"))].into_iter(),
};
_ = s.next_expected_kind(Kind::Kind1);
// let _val = s.next_expected_kind(Kind::Kind2(str));
let _val = match s.kinds.next() {
Some(Kind::Kind2(str)) => str,
_ => panic!("not expected"),
};
}
You could use std::mem::discriminant() like this:
use std::mem::{Discriminant, discriminant};
#[derive(Debug, PartialEq, Eq)]
enum Kind {
Kind1,
Kind2(String),
}
struct S {
kinds: std::vec::IntoIter<Kind>,
}
impl S {
fn next_expected_kind(&mut self, expected: Discriminant<Kind>) -> Result<Kind, &str> {
match self.kinds.next() {
Some(k) if discriminant(&k) == expected => Ok(k),
_ => Err("not expected"),
}
}
}
fn main() {
let mut s = S {
kinds: vec![Kind::Kind1, Kind::Kind2(String::from("2"))].into_iter(),
};
_ = dbg!(s.next_expected_kind(discriminant(&Kind::Kind1)));
let _val = dbg!(s.next_expected_kind(discriminant(&Kind::Kind2(String::new()))));
}
The obvious drawback is that you'll have to create an instance with "empty" or default data wherever you want to call it.
The only other way I can think of would be to write a macro since you can't pass just the "variant" of an enum around.
#[derive(Debug, PartialEq, Eq)]
enum Kind {
Kind1(String),
Kind2(i32, i32),
}
struct S {
kinds: std::vec::IntoIter<Kind>,
}
macro_rules! next_expected_kind {
($self:expr, $expected:path) => {
match $self.kinds.next() {
Some(k) if matches!(k, $expected(..)) => Ok(k),
_ => Err("not expected"),
}
}
}
fn main() {
let mut s = S {
kinds: vec![Kind::Kind1(String::from("1")), Kind::Kind2(2,3)].into_iter(),
};
_ = dbg!(next_expected_kind!(&mut s, Kind::Kind1));
let _val = dbg!(next_expected_kind!(&mut s, Kind::Kind2));
}
Note: this has the limitation that every variant passed to the macro has to be a tuple variant (the pattern it expands to is $expected(..)), and it's a bit clunky to use.
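A further option, not taken from the answers above, is to pass a predicate instead of a value: the caller writes the pattern with matches! and a wildcard, so no dummy instance is needed and any variant shape works. The sketch below assumes the method may take a closure parameter (named is_expected here, a hypothetical name) and, like the original, consumes the item even when it does not match.

```rust
#[derive(Debug, PartialEq, Eq)]
enum Kind {
    Kind1,
    Kind2(String),
}

struct S {
    kinds: std::vec::IntoIter<Kind>,
}

impl S {
    /// Returns the next item if `is_expected` accepts it.
    /// The item is consumed either way, as in the original method.
    fn next_expected_kind(
        &mut self,
        is_expected: impl Fn(&Kind) -> bool,
    ) -> Result<Kind, &'static str> {
        match self.kinds.next() {
            Some(k) if is_expected(&k) => Ok(k),
            _ => Err("not expected"),
        }
    }
}

/// Usage: accept any Kind2 and hand back its payload.
fn take_kind2_payload(s: &mut S) -> Option<String> {
    match s.next_expected_kind(|k| matches!(k, Kind::Kind2(_))) {
        Ok(Kind::Kind2(text)) => Some(text),
        _ => None,
    }
}
```

The trade-off is that the call site is a little more verbose than passing Kind::Kind2 directly, but the pattern inside matches! can be as specific or as loose as needed.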

Wrap type in enum and return reference

type Id = u8;
struct A {
id: Id,
}
struct B {
id: Id,
}
struct C {
id: Id,
}
struct State {
a_vec: Vec<A>,
b_vec: Vec<B>,
c_vec: Vec<C>,
}
impl State {
fn new() -> Self {
Self {
a_vec: Vec::new(),
b_vec: Vec::new(),
c_vec: Vec::new(),
}
}
fn get_e0(&self, id: Id) -> &E0 {
if let Some(a) = self.a_vec.iter().find(|x| x.id==id) {
&E0::A(a)
} else if let Some(b) = self.b_vec.iter().find(|x| x.id==id) {
&E0::B(b)
} else {
panic!("ahh that id doesn't exist everbody panic!!!")
}
}
fn get_e0_mut(&mut self, id: Id) -> &mut E0 {
if let Some(a) = self.a_vec.iter_mut().find(|x| x.id==id) {
&mut E0::A(a)
} else if let Some(b) = self.b_vec.iter_mut().find(|x| x.id==id) {
&mut E0::B(b)
} else {
panic!("ahh that id doesn't exist everbody panic!!!")
}
}
}
enum E0 {
A(A),
B(B),
}
enum E1 {
A(A),
C(C),
}
fn main() {
let mut state = State::new();
let a0 = A { id: 0 };
let a1 = A { id: 1 };
let b0 = B { id: 2 };
let c0 = C { id: 3 };
state.a_vec.push(a0);
state.a_vec.push(a1);
state.b_vec.push(b0);
state.c_vec.push(c0);
let e5 = state.get_e0(1);
}
I'm looking for a way to implement the functions get_e0 and get_e0_mut so that they wrap several types in an enum: the caller shouldn't have to care whether their id belongs to an A or a B, only that they get an E0. A single Vec of E0s seems infeasible, because there may be other groupings such as E1.
If these functions are not possible, is there another method that reduces the overhead of searching each of the respective Vecs individually every time?
All ids are guaranteed to be unique.
You cannot return a reference to a temporary. Instead, you can make your enums generic over their contents. You can therefore use a single enum:
enum E0<T, U> {
A(T),
B(U),
}
You can then use it like this:
fn get_e0(&self, id: Id) -> E0<&A, &B> {
if let Some(a) = self.a_vec.iter().find(|x| x.id == id) {
E0::A(a)
} else if let Some(b) = self.b_vec.iter().find(|x| x.id == id) {
E0::B(b)
} else {
panic!("ahh that id doesn't exist everbody panic!!!")
}
}
fn get_e0_mut(&mut self, id: Id) -> E0<&mut A, &mut B> {
if let Some(a) = self.a_vec.iter_mut().find(|x| x.id == id) {
E0::A(a)
} else if let Some(b) = self.b_vec.iter_mut().find(|x| x.id == id) {
E0::B(b)
} else {
panic!("ahh that id doesn't exist everbody panic!!!")
}
}
Thanks to lifetime elision rules, you don't have to specify lifetimes.
Note that if you want to avoid the panic, your return type should express the notion that there can be no value found.
You can for example return an Option:
fn get_e0(&self, id: Id) -> Option<E0<&A, &B>> { ... }
Or alter the enum to have a None variant, similar to Option:
enum E0<T, U> {
A(T),
B(U),
None,
}
And use it like this:
fn get_e0(&self, id: Id) -> E0<&A, &B> {
if let Some(a) = self.a_vec.iter().find(|x| x.id==id) {
E0::A(a)
} else if let Some(b) = self.b_vec.iter().find(|x| x.id==id) {
E0::B(b)
} else {
E0::None
}
}
It is usually more idiomatic to express such situations in the type system than to panic.
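For completeness, here is a minimal sketch of the Option-returning variant mentioned above, with both the shared and the mutable lookup. The derives on A, B and E0 are assumptions added only so the results can be compared and printed; C and E1 are left out to keep it short.

```rust
type Id = u8;

#[derive(Debug, PartialEq)]
struct A {
    id: Id,
}

#[derive(Debug, PartialEq)]
struct B {
    id: Id,
}

// Generic over its contents, so it can hold &A / &B or &mut A / &mut B.
#[derive(Debug, PartialEq)]
enum E0<T, U> {
    A(T),
    B(U),
}

struct State {
    a_vec: Vec<A>,
    b_vec: Vec<B>,
}

impl State {
    /// Looks in a_vec first, then b_vec; None if the id is unknown.
    fn get_e0(&self, id: Id) -> Option<E0<&A, &B>> {
        if let Some(a) = self.a_vec.iter().find(|x| x.id == id) {
            Some(E0::A(a))
        } else {
            self.b_vec.iter().find(|x| x.id == id).map(E0::B)
        }
    }

    /// Same lookup, but handing out mutable references.
    fn get_e0_mut(&mut self, id: Id) -> Option<E0<&mut A, &mut B>> {
        if let Some(a) = self.a_vec.iter_mut().find(|x| x.id == id) {
            Some(E0::A(a))
        } else {
            self.b_vec.iter_mut().find(|x| x.id == id).map(E0::B)
        }
    }
}
```

A caller can then match on the result, for example `if let Some(E0::B(b)) = state.get_e0_mut(2) { b.id = 7; }`, and decide for itself what a missing id means.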

Get a raw vec with field names of any struct with a custom derive macro in Rust

I am trying to write a derive macro that generates a method returning a Vec with the names of the fields of a struct.
Code snippet below:
// Don't forget about the dependencies if you try to reproduce this locally
use proc_macro2::{Span, Ident};
use quote::quote;
use syn::{
punctuated::Punctuated, token::Comma, Attribute, DeriveInput, Fields, Meta, NestedMeta,
Variant, Visibility,
};
#[proc_macro_derive(StructFieldNames, attributes(struct_field_names))]
pub fn derive_field_names(input: proc_macro::TokenStream) -> proc_macro::TokenStream {
let ast: DeriveInput = syn::parse(input).unwrap();
let (vis, ty, generics) = (&ast.vis, &ast.ident, &ast.generics);
let names_struct_ident = Ident::new(&(ty.to_string() + "FieldStaticStr"), Span::call_site());
let fields = filter_fields(match ast.data {
syn::Data::Struct(ref s) => &s.fields,
_ => panic!("FieldNames can only be derived for structs"),
});
let names_struct_fields = fields.iter().map(|(vis, ident)| {
quote! {
#vis #ident: &'static str
}
});
let mut vec_fields: Vec<String> = Vec::new();
let names_const_fields = fields.iter().map(|(_vis, ident)| {
let ident_name = ident.to_string();
vec_fields.push(ident_name);
quote! {
#vis #ident: -
}
});
let names_const_fields_as_vec = fields.iter().map(|(_vis, ident)| {
let ident_name = ident.to_string();
// vec_fields.push(ident_name)
});
let (impl_generics, ty_generics, where_clause) = generics.split_for_impl();
let tokens = quote! {
#[derive(Debug)]
#vis struct #names_struct_ident {
#(#names_struct_fields),*
}
impl #impl_generics #ty #ty_generics
#where_clause
{
#vis fn get_field_names() -> &'static str {
// stringify!(
[ #(#vec_fields),* ]
.map( |s| s.to_string())
.collect()
// )
}
}
};
tokens.into()
}
fn filter_fields(fields: &Fields) -> Vec<(Visibility, Ident)> {
fields
.iter()
.filter_map(|field| {
if field
.attrs
.iter()
.find(|attr| has_skip_attr(attr, "struct_field_names"))
.is_none()
&& field.ident.is_some()
{
let field_vis = field.vis.clone();
let field_ident = field.ident.as_ref().unwrap().clone();
Some((field_vis, field_ident))
} else {
None
}
})
.collect::<Vec<_>>()
}
const ATTR_META_SKIP: &'static str = "skip";
fn has_skip_attr(attr: &Attribute, path: &'static str) -> bool {
if let Ok(Meta::List(meta_list)) = attr.parse_meta() {
if meta_list.path.is_ident(path) {
for nested_item in meta_list.nested.iter() {
if let NestedMeta::Meta(Meta::Path(path)) = nested_item {
if path.is_ident(ATTR_META_SKIP) {
return true;
}
}
}
}
}
false
}
The code is taken from here. Basically, I just want to get those values as Strings rather than accessing them via Foo::FIELD_NAMES.some_random_field, because I need them for another process.
How can I achieve that?
Thanks

Union-Find implementation does not update parent tags

I'm trying to create some sets of Strings and then merge some of these sets so that they have the same tag (of type usize). Once I initialize the map, I start adding strings:
self.clusters.make_set("a");
self.clusters.make_set("b");
When I call self.clusters.find("a") and self.clusters.find("b"), different values are returned, which is fine because I haven't merged the sets yet. Then I call the following method to merge two sets
let _ = self.clusters.union("a", "b");
If I call self.clusters.find("a") and self.clusters.find("b") now, I get the same value. However, when I call the finalize() method and try to iterate through the map, the original tags are returned, as if I never merged the sets.
self.clusters.finalize();
for (address, tag) in &self.clusters.map {
self.clusterizer_writer.write_all(format!("{};{}\n", address,
self.clusters.parent[*tag]).as_bytes()).unwrap();
}
// to output all keys with the same tag as a list.
let a: Vec<(usize, Vec<String>)> = {
let mut x = HashMap::new();
for (k, v) in self.clusters.map.clone() {
x.entry(v).or_insert_with(Vec::new).push(k)
}
x.into_iter().collect()
};
I can't figure out why this is the case, but I'm relatively new to Rust; maybe it's an issue with pointers?
Instead of "a" and "b", I'm actually using something like utils::arr_to_hex(&input.outpoint.txid) of type String.
This is the Rust implementation of the Union-Find algorithm that I am using:
/// Tarjan's Union-Find data structure.
#[derive(RustcDecodable, RustcEncodable)]
pub struct DisjointSet<T: Clone + Hash + Eq> {
set_size: usize,
parent: Vec<usize>,
rank: Vec<usize>,
map: HashMap<T, usize>, // Each T entry is mapped onto a usize tag.
}
impl<T> DisjointSet<T>
where
T: Clone + Hash + Eq,
{
pub fn new() -> Self {
const CAPACITY: usize = 1000000;
DisjointSet {
set_size: 0,
parent: Vec::with_capacity(CAPACITY),
rank: Vec::with_capacity(CAPACITY),
map: HashMap::with_capacity(CAPACITY),
}
}
pub fn make_set(&mut self, x: T) {
if self.map.contains_key(&x) {
return;
}
let len = &mut self.set_size;
self.map.insert(x, *len);
self.parent.push(*len);
self.rank.push(0);
*len += 1;
}
/// Returns Some(num), num is the tag of subset in which x is.
/// If x is not in the data structure, it returns None.
pub fn find(&mut self, x: T) -> Option<usize> {
let pos: usize;
match self.map.get(&x) {
Some(p) => {
pos = *p;
}
None => return None,
}
let ret = DisjointSet::<T>::find_internal(&mut self.parent, pos);
Some(ret)
}
/// Implements path compression.
fn find_internal(p: &mut Vec<usize>, n: usize) -> usize {
if p[n] != n {
let parent = p[n];
p[n] = DisjointSet::<T>::find_internal(p, parent);
p[n]
} else {
n
}
}
/// Union the subsets to which x and y belong.
/// If it returns Ok(usize), that value is the tag of the unified subset.
/// If it returns Err(), at least one of x and y is not in the disjoint-set.
pub fn union(&mut self, x: T, y: T) -> Result<usize, ()> {
let x_root;
let y_root;
let x_rank;
let y_rank;
match self.find(x) {
Some(x_r) => {
x_root = x_r;
x_rank = self.rank[x_root];
}
None => {
return Err(());
}
}
match self.find(y) {
Some(y_r) => {
y_root = y_r;
y_rank = self.rank[y_root];
}
None => {
return Err(());
}
}
// Implements union-by-rank optimization.
if x_root == y_root {
return Ok(x_root);
}
if x_rank > y_rank {
self.parent[y_root] = x_root;
return Ok(x_root);
} else {
self.parent[x_root] = y_root;
if x_rank == y_rank {
self.rank[y_root] += 1;
}
return Ok(y_root);
}
}
/// Forces all laziness, updating every tag.
pub fn finalize(&mut self) {
for i in 0..self.set_size {
DisjointSet::<T>::find_internal(&mut self.parent, i);
}
}
}
I think you're just not extracting the information out of your DisjointSet struct correctly.
I got sniped by this and implemented union-find. First, with a basic usize implementation:
pub struct UnionFinderImpl {
parent: Vec<usize>,
}
Then with a wrapper for more generic types:
pub struct UnionFinder<T: Hash> {
rev: Vec<Rc<T>>,
fwd: HashMap<Rc<T>, usize>,
uf: UnionFinderImpl,
}
Both structs implement a groups() method that returns a Vec<Vec<>> of groups. Clone isn't required because I used Rc.
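The linked code is not reproduced here, so the following is only a sketch of what such a pair of structs might look like. The method bodies, the path-halving find, and the omission of union-by-rank are assumptions of this sketch, not the answerer's actual implementation. One thing that differs from the grouping loop in the question is that groups() keys every entry by the root returned from find, not by the tag that was stored when the element was inserted.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::rc::Rc;

/// Index-based union-find; every element is identified by a usize.
pub struct UnionFinderImpl {
    parent: Vec<usize>,
}

impl UnionFinderImpl {
    pub fn new() -> Self {
        Self { parent: Vec::new() }
    }
    /// Adds a fresh singleton set and returns its index.
    pub fn make_set(&mut self) -> usize {
        let id = self.parent.len();
        self.parent.push(id);
        id
    }
    /// Returns the root of `x`, halving the path on the way up.
    pub fn find(&mut self, mut x: usize) -> usize {
        while self.parent[x] != x {
            let grandparent = self.parent[self.parent[x]];
            self.parent[x] = grandparent;
            x = grandparent;
        }
        x
    }
    pub fn union(&mut self, a: usize, b: usize) {
        let (root_a, root_b) = (self.find(a), self.find(b));
        if root_a != root_b {
            self.parent[root_a] = root_b;
        }
    }
    /// Groups all indices by their *current root*.
    pub fn groups(&mut self) -> Vec<Vec<usize>> {
        let mut by_root: HashMap<usize, Vec<usize>> = HashMap::new();
        for i in 0..self.parent.len() {
            let root = self.find(i);
            by_root.entry(root).or_default().push(i);
        }
        by_root.into_values().collect()
    }
}

/// Wrapper mapping arbitrary hashable values onto indices.
pub struct UnionFinder<T: Hash + Eq> {
    rev: Vec<Rc<T>>,
    fwd: HashMap<Rc<T>, usize>,
    uf: UnionFinderImpl,
}

impl<T: Hash + Eq> UnionFinder<T> {
    pub fn new() -> Self {
        Self { rev: Vec::new(), fwd: HashMap::new(), uf: UnionFinderImpl::new() }
    }
    /// Inserts `x` if it is new; returns its index either way.
    pub fn make_set(&mut self, x: T) -> usize {
        if let Some(&id) = self.fwd.get(&x) {
            return id;
        }
        let shared = Rc::new(x);
        let id = self.uf.make_set();
        self.rev.push(Rc::clone(&shared));
        self.fwd.insert(shared, id);
        id
    }
    /// Returns None if either value was never inserted.
    pub fn union(&mut self, a: &T, b: &T) -> Option<()> {
        let (ia, ib) = (*self.fwd.get(a)?, *self.fwd.get(b)?);
        self.uf.union(ia, ib);
        Some(())
    }
    pub fn groups(&mut self) -> Vec<Vec<Rc<T>>> {
        let index_groups = self.uf.groups();
        index_groups
            .into_iter()
            .map(|g| g.into_iter().map(|i| Rc::clone(&self.rev[i])).collect())
            .collect()
    }
}
```

The order of the returned groups follows HashMap iteration order, so callers that need a stable order should sort the result.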

Polymorphism in Rust and trait references (trait objects?)

I'm writing a process memory scanner with a console prompt interface in Rust.
I need scanner types such as a winapi scanner or a ring0 driver scanner so I'm trying to implement polymorphism.
I have the following construction at this moment:
pub trait Scanner {
fn attach(&mut self, pid: u32) -> bool;
fn detach(&mut self);
}
pub struct WinapiScanner {
pid: u32,
hprocess: HANDLE,
addresses: Vec<usize>
}
impl WinapiScanner {
pub fn new() -> WinapiScanner {
WinapiScanner {
pid: 0,
hprocess: 0 as HANDLE,
addresses: Vec::<usize>::new()
}
}
}
impl Scanner for WinapiScanner {
fn attach(&mut self, pid: u32) -> bool {
let handle = unsafe { OpenProcess(PROCESS_ALL_ACCESS, FALSE, pid) };
if handle != 0 as HANDLE { // OpenProcess returns NULL on failure
self.pid = pid;
self.hprocess = handle;
true
} else {
false
}
}
fn detach(&mut self) {
unsafe { CloseHandle(self.hprocess) };
self.pid = 0;
self.hprocess = 0 as HANDLE;
self.addresses.clear();
}
}
In future, I'll have some more scanner types besides WinapiScanner, so, if I understand correctly, I should use a trait reference (&Scanner) to implement polymorphism. I'm trying to create Scanner object like this (note the comments):
enum ScannerType {
Winapi
}
pub fn start() {
let mut scanner: Option<&mut Scanner> = None;
let mut scanner_type = ScannerType::Winapi;
loop {
let line = prompt();
let tokens: Vec<&str> = line.split_whitespace().collect();
match tokens[0] {
// commands
"scanner" => {
if tokens.len() != 2 {
println!("\"scanner\" command takes 1 argument")
} else {
match tokens[1] {
"list" => {
println!("Available scanners: winapi");
},
"winapi" => {
scanner_type = ScannerType::Winapi;
println!("Scanner type set to: winapi");
},
x => {
println!("Unknown scanner type: {}", x);
}
}
}
},
"attach" => {
if tokens.len() > 1 {
match tokens[1].parse::<u32>() {
Ok(pid) => {
scanner = match scanner_type {
// ----------------------
// Problem goes here.
// Object, created by WinapiScanner::new() constructor
// doesn't live long enough to borrow it here
ScannerType::Winapi => Some(&mut WinapiScanner::new())
// ----------------------
}
}
Err(_) => {
println!("Wrong pid");
}
}
}
},
x => println!("Unknown command: {}", x)
}
}
}
fn prompt() -> String {
use std::io::Write;
use std::io::BufRead;
let stdout = io::stdout();
let mut lock = stdout.lock();
let _ = lock.write(">> ".as_bytes());
let _ = lock.flush();
let stdin = io::stdin();
let mut lock = stdin.lock();
let mut buf = String::new();
let _ = lock.read_line(&mut buf);
String::from(buf.trim())
}
It's not a full program; I've pasted important parts only.
What am I doing wrong and how do I implement what I want in Rust?
Trait objects must be used behind a pointer. But references are not the only kind of pointers; Box is also a pointer!
let mut scanner: Option<Box<dyn Scanner>> = None;
scanner = match scanner_type {
ScannerType::Winapi => Some(Box::new(WinapiScanner::new()))
};
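To try the Box approach without the Windows API, here is a small self-contained sketch. MockScanner, ScannerType::Mock, make_scanner and attach_with are hypothetical stand-ins invented for this example; only the Scanner trait comes from the question. Because the Box owns the scanner, the value can be stored in an Option that outlives the match arm that created it.

```rust
pub trait Scanner {
    fn attach(&mut self, pid: u32) -> bool;
    fn detach(&mut self);
}

// Hypothetical stand-in for WinapiScanner so the sketch runs anywhere.
#[derive(Default)]
pub struct MockScanner {
    pid: u32,
}

impl MockScanner {
    pub fn pid(&self) -> u32 {
        self.pid
    }
}

impl Scanner for MockScanner {
    fn attach(&mut self, pid: u32) -> bool {
        // Treat pid 0 as "no such process" for the purposes of the sketch.
        if pid == 0 {
            return false;
        }
        self.pid = pid;
        true
    }

    fn detach(&mut self) {
        self.pid = 0;
    }
}

pub enum ScannerType {
    Mock,
}

/// Builds an owned trait object for the selected scanner type.
pub fn make_scanner(kind: &ScannerType) -> Box<dyn Scanner> {
    match kind {
        ScannerType::Mock => Box::new(MockScanner::default()),
    }
}

/// Creates a scanner and keeps it only if attaching succeeded.
pub fn attach_with(kind: &ScannerType, pid: u32) -> Option<Box<dyn Scanner>> {
    let mut scanner = make_scanner(kind);
    if scanner.attach(pid) {
        Some(scanner)
    } else {
        None
    }
}
```

In the prompt loop from the question, the "attach" branch could then assign the result of such a helper to the scanner variable, and a later "detach" command could call `scanner.as_mut()` to reach the boxed object.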
