I wish that enums in Rust can be used like Haskell's productive type. I want to
access a field's value directly
assign a field's value directly or make a clone with the changing value.
Directly means that not using too long pattern matching code, but just could access like let a_size = a.size.
In Haskell:
data TypeAB = A {size::Int, name::String} | B {size::Int, switch::Bool} deriving Show
main = do
let a = A 1 "abc"
let b = B 1 True
print (size a) -- could access a field's value directly
print (name a) -- could access a field's value directly
print (switch b) -- could access a field's value directly
let aa = a{size=2} -- could make a clone directly with the changing value
print aa
I tried two styles of Rust enum definition like
Style A:
#[derive(Debug)]
enum EntryType {
A(TypeA),
B(TypeB),
}
#[derive(Debug)]
struct TypeA {
size: u32,
name: String,
}
#[derive(Debug)]
struct TypeB {
size: u32,
switch: bool,
}
fn main() {
let mut ta = TypeA {
size: 3,
name: "TAB".to_string(),
};
println!("{:?}", &ta);
ta.size = 2;
ta.name = "TCD".to_string();
println!("{:?}", &ta);
let mut ea = EntryType::A(TypeA {
size: 1,
name: "abc".to_string(),
});
let mut eb = EntryType::B(TypeB {
size: 1,
switch: true,
});
let vec_ab = vec![&ea, &eb];
println!("{:?}", &ea);
println!("{:?}", &eb);
println!("{:?}", &vec_ab);
// Want to do like `ta.size = 2` for ea
// Want to do like `ta.name = "bcd".to_string()` for ea
// Want to do like `tb.switch = false` for eb
// ????
println!("{:?}", &ea);
println!("{:?}", &eb);
println!("{:?}", &vec_ab);
}
Style B:
#[derive(Debug)]
enum TypeCD {
TypeC { size: u32, name: String },
TypeD { size: u32, switch: bool },
}
fn main() {
// NOTE: Rust requires representative struct name before each constructor
// TODO: Check constructor name can be duplicated
let mut c = TypeCD::TypeC {
size: 1,
name: "abc".to_string(),
};
let mut d = TypeCD::TypeD {
size: 1,
switch: true,
};
let vec_cd = vec![&c, &d];
println!("{:?}", &c);
println!("{:?}", &d);
println!("{:?}", &vec_cd);
// Can't access a field's value like
// let c_size = c.size
let c_size = c.size; // [ERROR]: No field `size` on `TypeCD`
let c_name = c.name; // [ERROR]: No field `name` on `TypeCD`
let d_switch = d.switch; // [ERROR]: No field `switch` on `TypeCD`
// Can't change a field's value like
// c.size = 2;
// c.name = "cde".to_string();
// d.switch = false;
println!("{:?}", &c);
println!("{:?}", &d);
println!("{:?}", &vec_cd);
}
I couldn't access/assign values directly in any style. Do I have to implement functions or a trait just to access a field's value? Is there some way of deriving things to help this situation?
What about style C:
#[derive(Debug)]
enum Color {
Green { name: String },
Blue { switch: bool },
}
#[derive(Debug)]
struct Something {
size: u32,
color: Color,
}
fn main() {
let c = Something {
size: 1,
color: Color::Green {
name: "green".to_string(),
},
};
let d = Something {
size: 2,
color: Color::Blue { switch: true },
};
let vec_cd = vec![&c, &d];
println!("{:?}", &c);
println!("{:?}", &d);
println!("{:?}", &vec_cd);
let _ = c.size;
}
If all variant have something in common, why separate them?
Of course, I need to access not common field too.
This would imply that Rust should define what to do when the actual type at runtime doesn't contain the field you required. So, I don't think Rust would add this one day.
You could do it yourself. It will require some lines of code, but that matches the behavior of your Haskell code. However, I don't think this is the best thing to do. Haskell is Haskell, I think you should code in Rust and not try to code Haskell by using Rust. That a general rule, some feature of Rust come directly from Haskell, but what you want here is very odd in my opinion for Rust code.
#[derive(Debug)]
enum Something {
A { size: u32, name: String },
B { size: u32, switch: bool },
}
impl Something {
fn size(&self) -> u32 {
match self {
Something::A { size, .. } => *size,
Something::B { size, .. } => *size,
}
}
fn name(&self) -> &String {
match self {
Something::A { name, .. } => name,
Something::B { .. } => panic!("Something::B doesn't have name field"),
}
}
fn switch(&self) -> bool {
match self {
Something::A { .. } => panic!("Something::A doesn't have switch field"),
Something::B { switch, .. } => *switch,
}
}
fn new_size(&self, size: u32) -> Something {
match self {
Something::A { name, .. } => Something::A {
size,
name: name.clone(),
},
Something::B { switch, .. } => Something::B {
size,
switch: *switch,
},
}
}
// etc...
}
fn main() {
let a = Something::A {
size: 1,
name: "Rust is not haskell".to_string(),
};
println!("{:?}", a.size());
println!("{:?}", a.name());
let b = Something::B {
size: 1,
switch: true,
};
println!("{:?}", b.switch());
let aa = a.new_size(2);
println!("{:?}", aa);
}
I think there is currently no built-in way of accessing size directly on the enum type. Until then, enum_dispatch or a macro-based solution may help you.
Related
I've the following code I implemented to return an error when a string doesn't contain a match to something like "Mark, 55" to return an error. The compiler complains about my code Err(error) => Err(ParsePersonError::ParseInt(_))
use std::num::ParseIntError;
use std::str::FromStr;
#[derive(Debug, PartialEq)]
struct Person {
name: String,
age: usize,
}
// We will use this error type for the `FromStr` implementation.
#[derive(Debug, PartialEq)]
enum ParsePersonError {
// Empty input string
Empty,
// Incorrect number of fields
BadLen,
// Empty name field
NoName,
// Wrapped error from parse::<usize>()
ParseInt(ParseIntError),
}
// My implementation
impl FromStr for Person {
type Err = ParsePersonError;
fn from_str(s: &str) -> Result<Person, Self::Err> {
if s.len() == 0 {
Err(ParsePersonError::Empty)
}
else {
let v: Vec<&str> = s.split(",").collect();
println!("{:?}",v);
if &v[0]== &""{
Err(ParsePersonError::NoName)
}
else if v.len()!=2 {
Err(ParsePersonError::BadLen)
}
else {
let num = match v[1].parse::<usize>() {
Ok(n) => {
let name = v[0].to_string();
let age = n;
return Ok(Person {name,age})
},
Err(error) => Err(ParsePersonError::ParseInt(_))
};
Err(ParsePersonError::ParseInt(_))
}
}
}
}
However, the compiler doesn't complain about the same code in my test case.
ParsePersonError::ParseInt(_))
#[test]
fn missing_name_and_age() {
assert!(matches!(
",".parse::<Person>(),
Err(ParsePersonError::NoName | ParsePersonError::ParseInt(_))
));
}
What am I missing?
If you de-sugar the match macro in test, it is something like
let result = ",".parse::<Person>();
let r = match result {
Err(ParsePersonError::NoName) => true,
Err(ParsePersonError::ParseInt(_)) => true,
_ => false,
};
assert!(r);
Essentially the '_' is to hold the value attached to ParseInt. Hence an assignment. This is what the compiler error message means.
type Id = u8;
struct A {
id: Id,
}
struct B {
id: Id,
}
struct C {
id: Id,
}
struct State {
a_vec: Vec<A>,
b_vec: Vec<B>,
c_vec: Vec<C>,
}
impl State {
fn new() -> Self {
Self {
a_vec: Vec::new(),
b_vec: Vec::new(),
c_vec: Vec::new(),
}
}
fn get_e0(&self, id: Id) -> &E0 {
if let Some(a) = self.a_vec.iter().find(|x| x.id==id) {
&E0::A(a)
} else if let Some(b) = self.b_vec.iter().find(|x| x.id==id) {
&E0::B(b)
} else {
panic!("ahh that id doesn't exist everbody panic!!!")
}
}
fn get_e0_mut(&mut self, id: Id) -> &mut E0 {
if let Some(a) = self.a_vec.iter_mut().find(|x| x.id==id) {
&mut E0::A(a)
} else if let Some(b) = self.b_vec.iter_mut().find(|x| x.id==id) {
&mut E0::B(b)
} else {
panic!("ahh that id doesn't exist everbody panic!!!")
}
}
}
enum E0 {
A(A),
B(B),
}
enum E1 {
A(A),
C(C),
}
fn main() {
let state = State::new();
let a0 = A { id: 0 };
let a1 = A { id: 1 };
let b0 = B { id: 2 };
let c0 = C { id: 3 };
state.a_vec.push(a0);
state.a_vec.push(a1);
state.b_vec.push(b0);
state.c_vec.push(c0);
let e5 = state.get_e0(1);
}
I'm looking for a way to implement the function get_e0 and get_e0_mut that wrap several types into an enum so the caller doesn't have to care which of A or B their id relates to, only that they will get an E0. Yet an Vec of E0's seems unfeasible as there might be separate grouping such as E1.
If these functions are not possible then is there another method that could be used to reduce the overhead of searching all the respective Vec's individually each time.
It is guaranteed that the all id's are unique.
You cannot return a reference to a temporary. Instead, you can make your enums generic over their contents. You can therefore use a single enum:
enum E0<T, U> {
A(T),
B(U),
}
You can then use it like this:
fn get_e0(&self, id: Id) -> E0<&A, &B> {
if let Some(a) = self.a_vec.iter().find(|x| x.id == id) {
E0::A(a)
} else if let Some(b) = self.b_vec.iter().find(|x| x.id == id) {
E0::B(b)
} else {
panic!("ahh that id doesn't exist everbody panic!!!")
}
}
fn get_e0_mut(&mut self, id: Id) -> E0<&mut A, &mut B> {
if let Some(a) = self.a_vec.iter_mut().find(|x| x.id == id) {
E0::A(a)
} else if let Some(b) = self.b_vec.iter_mut().find(|x| x.id == id) {
E0::B(b)
} else {
panic!("ahh that id doesn't exist everbody panic!!!")
}
}
Thanks to lifetime elision rules, you don't have to specify lifetimes.
Playground link
Note that if you want to avoid the panic, your return type should express the notion that there can be no value found.
You can for example return an Option:
fn get_e0(&self, id: Id) -> Option<E0<&A, &B>> { ... }
Or alter the enum to have a None variant, similar to Option:
enum E0<T, U> {
A(T),
B(U),
None,
}
And use it like this:
fn get_e0(&self, id: Id) -> E0<&A, &B> {
if let Some(a) = self.a_vec.iter().find(|x| x.id==id) {
E0::A(a)
} else if let Some(b) = self.b_vec.iter().find(|x| x.id==id) {
E0::B(b)
} else {
E0::None
}
}
It is most of the time more idiomatic to express such situations using the type system instead of panicking.
Suppose my library definespub struct MyStruct { a: i32 }. Then, suppose I define a procedural macro my_macro!. Perhaps I want my_macro!(11) to expand to MyStruct { a: 11 }. This proves that I must have used my_macro! to obtain MyStruct, since the a field is not public.
This is what I want to do:
lib.rs:
struct MyStruct {
a: i32,
}
#[proc_macro]
pub fn my_macro(input: TokenStream) -> TokenStream {
let ast: syn::Expr = syn::parse(input).unwrap();
impl_my_macro(ast)
}
fn impl_my_macro(expr: syn::Expr) -> TokenStream {
let gen = match expr {
syn::Expr::Lit(expr_lit) => match expr_lit.lit {
syn::Lit::Int(lit_int) => {
let value = lit_int.base10_parse::<i32>().unwrap();
if value > 100 {
quote::quote! {
compile_error("Integer literal is too big.");
}
} else {
quote::quote! {
MyStruct{a: #lit_int}
}
}
}
_ => quote::quote! {
compile_error!("Expected an integer literal.");
},
},
_ => quote::quote! {
compile_error!("Expected an integer literal.");
},
};
gen.into()
}
And I would use it like this:
fn test_my_macro() {
let my_struct = secret_macro::my_macro!(10);
}
But the compiler gives this warning:
error[E0422]: cannot find struct, variant or union type `MyStruct` in this scope
--> tests/test.rs:14:21
|
14 | secret_macro::my_macro!(10);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ not found in this scope
Is it possible to achieve this in Rust?
I'm trying to create some sets of Strings and then merge some of these sets so that they have the same tag (of type usize). Once I initialize the map, I start adding strings:
self.clusters.make_set("a");
self.clusters.make_set("b");
When I call self.clusters.find("a") and self.clusters.find("b"), different values are returned, which is fine because I haven't merged the sets yet. Then I call the following method to merge two sets
let _ = self.clusters.union("a", "b");
If I call self.clusters.find("a") and self.clusters.find("b") now, I get the same value. However, when I call the finalize() method and try to iterate through the map, the original tags are returned, as if I never merged the sets.
self.clusters.finalize();
for (address, tag) in &self.clusters.map {
self.clusterizer_writer.write_all(format!("{};{}\n", address,
self.clusters.parent[*tag]).as_bytes()).unwrap();
}
// to output all keys with the same tag as a list.
let a: Vec<(usize, Vec<String>)> = {
let mut x = HashMap::new();
for (k, v) in self.clusters.map.clone() {
x.entry(v).or_insert_with(Vec::new).push(k)
}
x.into_iter().collect()
};
I can't figure out why this is the case, but I'm relatively new to Rust; maybe its an issue with pointers?
Instead of "a" and "b", I'm actually using something like utils::arr_to_hex(&input.outpoint.txid) of type String.
This is the Rust implementation of the Union-Find algorithm that I am using:
/// Tarjan's Union-Find data structure.
#[derive(RustcDecodable, RustcEncodable)]
pub struct DisjointSet<T: Clone + Hash + Eq> {
set_size: usize,
parent: Vec<usize>,
rank: Vec<usize>,
map: HashMap<T, usize>, // Each T entry is mapped onto a usize tag.
}
impl<T> DisjointSet<T>
where
T: Clone + Hash + Eq,
{
pub fn new() -> Self {
const CAPACITY: usize = 1000000;
DisjointSet {
set_size: 0,
parent: Vec::with_capacity(CAPACITY),
rank: Vec::with_capacity(CAPACITY),
map: HashMap::with_capacity(CAPACITY),
}
}
pub fn make_set(&mut self, x: T) {
if self.map.contains_key(&x) {
return;
}
let len = &mut self.set_size;
self.map.insert(x, *len);
self.parent.push(*len);
self.rank.push(0);
*len += 1;
}
/// Returns Some(num), num is the tag of subset in which x is.
/// If x is not in the data structure, it returns None.
pub fn find(&mut self, x: T) -> Option<usize> {
let pos: usize;
match self.map.get(&x) {
Some(p) => {
pos = *p;
}
None => return None,
}
let ret = DisjointSet::<T>::find_internal(&mut self.parent, pos);
Some(ret)
}
/// Implements path compression.
fn find_internal(p: &mut Vec<usize>, n: usize) -> usize {
if p[n] != n {
let parent = p[n];
p[n] = DisjointSet::<T>::find_internal(p, parent);
p[n]
} else {
n
}
}
/// Union the subsets to which x and y belong.
/// If it returns Ok<u32>, it is the tag for unified subset.
/// If it returns Err(), at least one of x and y is not in the disjoint-set.
pub fn union(&mut self, x: T, y: T) -> Result<usize, ()> {
let x_root;
let y_root;
let x_rank;
let y_rank;
match self.find(x) {
Some(x_r) => {
x_root = x_r;
x_rank = self.rank[x_root];
}
None => {
return Err(());
}
}
match self.find(y) {
Some(y_r) => {
y_root = y_r;
y_rank = self.rank[y_root];
}
None => {
return Err(());
}
}
// Implements union-by-rank optimization.
if x_root == y_root {
return Ok(x_root);
}
if x_rank > y_rank {
self.parent[y_root] = x_root;
return Ok(x_root);
} else {
self.parent[x_root] = y_root;
if x_rank == y_rank {
self.rank[y_root] += 1;
}
return Ok(y_root);
}
}
/// Forces all laziness, updating every tag.
pub fn finalize(&mut self) {
for i in 0..self.set_size {
DisjointSet::<T>::find_internal(&mut self.parent, i);
}
}
}
I think you're just not extracting the information out of your DisjointSet struct correctly.
I got sniped by this and implemented union find. First, with a basic usize implemention:
pub struct UnionFinderImpl {
parent: Vec<usize>,
}
Then with a wrapper for more generic types:
pub struct UnionFinder<T: Hash> {
rev: Vec<Rc<T>>,
fwd: HashMap<Rc<T>, usize>,
uf: UnionFinderImpl,
}
Both structs implement a groups() method that returns a Vec<Vec<>> of groups. Clone isn't required because I used Rc.
Playground
I'm rather new to Rust and have put together a little experiment that blows my understanding of annotations entirely out of the water. This is compiled with rust-0.13.0-nightly and there's a playpen version of the code here.
The meat of the program is the function 'recognize', which is co-responsible for allocating String instances along with the function 'lex'. I'm sure the code is a bit goofy so, in addition to getting the lifetimes right enough to get this compiling I would also happily accept some guidance on making this idiomatic.
#[deriving(Show)]
enum Token<'a> {
Field(&'a std::string::String),
}
#[deriving(Show)]
struct LexerState<'a> {
character: int,
field: int,
tokens: Vec<Token<'a>>,
str_buf: &'a std::string::String,
}
// The goal with recognize is to:
//
// * gather all A .. z into a temporary string buffer str_buf
// * on ',', move buffer into a Field token
// * store the completely extracted field in LexerState's tokens attribute
//
// I think I'm not understanding how to specify the lifetimes and mutability
// correctly.
fn recognize<'a, 'r>(c: char, ctx: &'r mut LexerState<'a>) -> &'r mut LexerState<'a> {
match c {
'A' ... 'z' => {
ctx.str_buf.push(c);
},
',' => {
ctx.tokens.push(Field(ctx.str_buf));
ctx.field += 1;
ctx.str_buf = &std::string::String::new();
},
_ => ()
};
ctx.character += 1;
ctx
}
fn lex<'a, I, E>(it: &mut I)
-> LexerState<'a> where I: Iterator<Result<char, E>> {
let mut ctx = LexerState { character: 0, field: 0,
tokens: Vec::new(), str_buf: &std::string::String::new() };
for val in *it {
let c:char = val.ok().expect("wtf");
recognize(c, &mut ctx);
}
ctx
}
fn main() {
let tokens = lex(&mut std::io::stdio::stdin().chars());
println!("{}", tokens)
}
In this case, you're constructing new strings rather than borrowing existing strings, so you'd use an owned string directly:
use std::mem;
#[deriving(Show)]
enum Token {
Field(String),
}
#[deriving(Show)]
struct LexerState {
character: int,
field: int,
tokens: Vec<Token>,
str_buf: String,
}
// The goal with recognize is to:
//
// * gather all A .. z into a temporary string buffer str_buf
// * on ',', move buffer into a Field token
// * store the completely extracted field in LexerState's tokens attribute
//
// I think I'm not understanding how to specify the lifetimes and mutability
// correctly.
fn recognize<'a, 'r>(c: char, ctx: &'r mut LexerState) -> &'r mut LexerState {
match c {
'A' ...'z' => { ctx.str_buf.push(c); }
',' => {
ctx.tokens.push(Field(mem::replace(&mut ctx.str_buf,
String::new())));
ctx.field += 1;
}
_ => (),
};
ctx.character += 1;
ctx
}
fn lex<I, E>(it: &mut I) -> LexerState where I: Iterator<Result<char, E>> {
let mut ctx =
LexerState{
character: 0,
field: 0,
tokens: Vec::new(),
str_buf: String::new(),
};
for val in *it {
let c: char = val.ok().expect("wtf");
recognize(c, &mut ctx);
}
ctx
}
fn main() {
let tokens = lex(&mut std::io::stdio::stdin().chars());
println!("{}" , tokens)
}