Exhaustivity checking for 1-to-1 mapping between enums - rust

I'm writing some basic bioinformatics code to transcribe DNA to RNA:
pub enum DnaNucleotide {
A,
C,
G,
T,
}
pub enum RnaNucleotide {
A,
C,
G,
U,
}
fn transcribe(base: &DnaNucleotide) -> RnaNucleotide {
match base {
DnaNucleotide::A => RnaNucleotide::A,
DnaNucleotide::C => RnaNucleotide::C,
DnaNucleotide::G => RnaNucleotide::G,
DnaNucleotide::T => RnaNucleotide::U,
}
}
Is there a way to get the compiler to do an exhaustivity check also on the right side of the match statement, basically ensuring a 1-1 mapping between the two enums?
(A related question: The above is probably better represented with some kind of bijective map, but I don't want to lose the exhaustivity checking. Is there a better way?)

The fact that a one-to-one correspondence exists between two enums suggests that you should really only be using one enum behind the scenes. Here is an example of a data model that I think suits your needs. This is naturally exhaustive because there is only a single enum to begin with.
use core::fmt::{Debug, Error, Formatter};
enum NucleicAcid {
Dna,
Rna,
}
enum Nucleotide {
A,
C,
G,
TU,
}
struct BasePair {
nucleic_acid: NucleicAcid,
nucleotide: Nucleotide,
}
impl BasePair {
fn new(nucleic_acid: NucleicAcid, nucleotide: Nucleotide) -> Self {
Self {
nucleic_acid,
nucleotide,
}
}
}
impl Debug for BasePair {
fn fmt(&self, f: &mut Formatter<'_>) -> Result<(), Error> {
use NucleicAcid::*;
use Nucleotide::*;
let BasePair {
nucleic_acid,
nucleotide,
} = self;
let nucleic_acid_str = match nucleic_acid {
Dna => "dna",
Rna => "rna",
};
let nucleotide_str = match nucleotide {
A => "A",
C => "C",
G => "G",
TU => match nucleic_acid {
Dna => "T",
Rna => "U",
},
};
f.write_fmt(format_args!("{}:{}", nucleic_acid_str, nucleotide_str))
}
}
fn main() {
let bp1 = BasePair::new(NucleicAcid::Dna, Nucleotide::TU);
let bp2 = BasePair::new(NucleicAcid::Rna, Nucleotide::C);
println!("{:?}, {:?}", bp1, bp2);
// dna:T, rna:C
}
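If the two enums have to stay separate, one possible way to approximate a check on the right-hand side is to write the inverse function as well and verify that the two round-trip. This is only a sketch: `reverse_transcribe`, `ALL_DNA`, `ALL_RNA` and `is_bijection` are hypothetical names, and the variant lists are maintained by hand, so the compiler checks each `match` but not the arrays.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DnaNucleotide { A, C, G, T }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RnaNucleotide { A, C, G, U }

// Exhaustive over DnaNucleotide (checked by the compiler).
pub fn transcribe(base: DnaNucleotide) -> RnaNucleotide {
    match base {
        DnaNucleotide::A => RnaNucleotide::A,
        DnaNucleotide::C => RnaNucleotide::C,
        DnaNucleotide::G => RnaNucleotide::G,
        DnaNucleotide::T => RnaNucleotide::U,
    }
}

// Hypothetical inverse: exhaustive over RnaNucleotide (also compiler-checked).
pub fn reverse_transcribe(base: RnaNucleotide) -> DnaNucleotide {
    match base {
        RnaNucleotide::A => DnaNucleotide::A,
        RnaNucleotide::C => DnaNucleotide::C,
        RnaNucleotide::G => DnaNucleotide::G,
        RnaNucleotide::U => DnaNucleotide::T,
    }
}

// Hand-maintained variant lists (assumption: kept in sync with the enums).
pub const ALL_DNA: [DnaNucleotide; 4] =
    [DnaNucleotide::A, DnaNucleotide::C, DnaNucleotide::G, DnaNucleotide::T];
pub const ALL_RNA: [RnaNucleotide; 4] =
    [RnaNucleotide::A, RnaNucleotide::C, RnaNucleotide::G, RnaNucleotide::U];

// Two total functions that are mutual inverses form a 1-to-1 mapping.
pub fn is_bijection() -> bool {
    ALL_DNA.iter().all(|&d| reverse_transcribe(transcribe(d)) == d)
        && ALL_RNA.iter().all(|&r| transcribe(reverse_transcribe(r)) == r)
}
```

Calling `is_bijection()` from a unit test turns a duplicated or forgotten right-hand side into a test failure rather than a compile error, which is weaker than what the question asks for but needs no extra crates.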

Related

How to access nested enums without full match syntax

I am using the Serde crate to deserialise a JSON file, which has a nested structure like this:
struct Nested {
a: Vec<Foo>,
b: u8,
}
struct Foo {
c: Bar,
d: Vec<f32>,
}
struct Bar {
e: u32,
f: String,
}
Part of the application's purpose is to check for missing parameters (or parameters of the wrong type) and then display a nicely printed list of the errors found in the file, so I need to handle the structure having missing or wrongly typed parameters.
I came across this great post that helped solve my issue by wrapping each parameter in an enum that contains the value if it parsed, the raw value if it failed, or a final variant if it was missing (since the nested structures might also be missing, I wrapped them in the same enum):
pub enum TryParse<T> {
Parsed(T),
Unparsed(Value),
NotPresent
}
struct Nested {
a: TryParse<Vec<Foo>>,
b: TryParse<u8>,
}
struct Foo {
c: TryParse<Bar>,
d: TryParse<Vec<f32>>,
}
struct Bar {
e: TryParse<u32>,
f: TryParse<String>,
}
However, I'm not sure how to access them now without unpacking every step into a match statement. For example, I can access B very easily:
match file.b {
Parsed(val) => {println!("Got parameter of {}", val)},
Unparsed(val) => {println!("Invalid type: {:?}", val)}
NotPresent => {println!("b not found")},
};
However, I'm not sure how to access the nested ones (c, d, e and f). I can't use unwrap or expect since this isn't technically a Result, so how do I unpack these?
if file.a.c.e.Parsed() && file.a.c.e == 32 {... //invalid
if file.a.d && file.a.d.len() == 6... //invalid
I know that in a way this flies against Rust's 'handle every outcome' philosophy, and I do want to handle them, but I want to know if there is a nicer way than 400 nested match statements. (The above example is very simplified: the files I am using have up to 6 nested layers, each parameter in the top node has at least 3 layers, and some are vectors as well.)
Perhaps I need to implement a function similar to unwrap() on my 'TryParse'? or would it be better to wrap each parameter in a 'Result', extend that with the deserialise trait, and then somehow store the error in the Err option that says if it was a type error or missing parameter?
EDIT
I tried adding the following, some of which works and some of which does not:
impl <T> TryParse<T> {
pub fn is_ok(self) -> bool { //works
match self {
Self::Parsed(_t) => true,
_ => false,
}
}
pub fn is_absent(self) -> bool { //works
match self {
Self::NotPresent => true,
_ => false,
}
}
pub fn is_invalid(self) -> bool { //works
match self {
Self::Unparsed(_) => true,
_ => false,
}
}
pub fn get(self) -> Result<T, dyn Error> { //doesnt work
match self {
Self::Parsed(t) => Ok(t),
Self::Unparsed(e) => Err(e),
Self::NotPresent => Err("Invalid")
}
}
}
I can't believe it is this hard just to get the result. Should I just avoid nested enums, or get rid of the TryParse enum and functions altogether and wrap everything in a Result, so the user simply knows whether it worked or didn't (but gets no explanation of why it failed)?
Implementing unwrap() is one possibility. Using Result is another, with a custom error type. You can deserialize directly into result with #[serde(deserialize_with = "...")], or using a newtype wrapper.
However, a not-enough-used power of pattern matching is nested patterns. For example, instead of if file.a.c.e.Parsed() && file.a.c.e == 32 you can write:
if let TryParse::Parsed(a) = &file.a {
// Unfortunately we cannot combine this `if let` with the surrounding `if let`,
// because `Vec` doesn't support pattern matching (currently).
if let [Foo {
c:
TryParse::Parsed(Bar {
e: TryParse::Parsed(32),
..
}),
..
}, ..] = a.as_slice()
{
// ...
}
}
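When only the "everything parsed" path matters, another option is a small accessor that turns a `TryParse<T>` into an `Option<&T>`, so that nested layers can be chained with `?`. The following is a minimal sketch under stated assumptions: `parsed` and `first_e` are hypothetical names, and `Value` is replaced by a `String` alias so that the example has no dependencies (the real code would use `serde_json::Value`).

```rust
// Stand-in for serde_json::Value (assumption, keeps the sketch dependency-free).
pub type Value = String;

pub enum TryParse<T> {
    Parsed(T),
    Unparsed(Value),
    NotPresent,
}

impl<T> TryParse<T> {
    /// Hypothetical helper: borrow the parsed value, or `None` otherwise.
    pub fn parsed(&self) -> Option<&T> {
        match self {
            TryParse::Parsed(t) => Some(t),
            _ => None,
        }
    }
}

pub struct Bar {
    pub e: TryParse<u32>,
    pub f: TryParse<String>,
}

pub struct Foo {
    pub c: TryParse<Bar>,
    pub d: TryParse<Vec<f32>>,
}

pub struct Nested {
    pub a: TryParse<Vec<Foo>>,
    pub b: TryParse<u8>,
}

/// Returns `e` of the first `Foo`, or `None` if any layer on the way
/// is missing, unparsed, or the vector is empty.
pub fn first_e(file: &Nested) -> Option<u32> {
    let foos = file.a.parsed()?;
    let bar = foos.first()?.c.parsed()?;
    bar.e.parsed().copied()
}
```

A check such as `first_e(&file) == Some(32)` then replaces the nested `if let`, at the cost of no longer knowing which layer failed.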
This may not be the most Rust-y way of doing it, but for those like me moving from another language such as C, Python, or C++, this is how I did it; it still lets me quickly check whether I have an error and use the match syntax to handle it. Thanks to @Chayim Friedman for assisting with this. His way is probably better, but this made the most sense to me:
#[derive(Debug)]
pub enum TryParse<T> {
Parsed(T),
Unparsed(Value),
NotPresent
}
impl<'de, T: DeserializeOwned> Deserialize<'de> for TryParse<T> {
fn deserialize<D: Deserializer<'de>>(deserializer: D) -> Result<Self, D::Error> {
match Option::<Value>::deserialize(deserializer)? {
None => Ok(TryParse::NotPresent),
Some(value) => match T::deserialize(&value) {
Ok(t) => Ok(TryParse::Parsed(t)),
Err(_) => Ok(TryParse::Unparsed(value)),
},
}
}
}
impl <T> TryParse<T> {
//pub fn is_ok(self) -> bool { ---> Use get().is_ok(), built into result
// match self {
// Self::Parsed(_t) => true,
// _ => false,
// }
//}
pub fn is_absent(&self) -> bool {
match self {
Self::NotPresent => true,
_ => false,
}
}
pub fn is_invalid(&self) -> bool {
match self {
Self::Unparsed(_) => true,
_ => false,
}
}
pub fn get(&self) -> Result<&T, String> {
match self {
Self::Parsed(t) => Ok(t),
Self::Unparsed(v) => Err(format!("Unable to Parse {:?}", v)),
Self::NotPresent => Err("Parameter not Present".to_string())
}
}
// pub fn get_direct(&self) -> &T {
// match self {
// Self::Parsed(t) => t,
// _ => panic!("Can't get this value!"),
// }
// }
}
match &nested.a.get().unwrap()[1].c.get().expect("Missing C Parameter").e {
Parsed(val) => {println!("Got value of E: {}", val)},
Unparsed(val) => {println!("Invalid Type: {:?}", val)}
NotPresent => {println!("Param E Not Found")},
};
//Note the use of '&' at the beginning because we need to borrow a reference to it
I know I need to change my mindset to use the rust way of thinking, and I am completely open to other suggestions if they can demonstrate some working code.
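One variation on the `get` method above, offered only as a sketch, is to return a small error enum instead of a `String`, so that callers can still distinguish "missing" from "wrong type" after converting to a `Result`. `FieldError` is a hypothetical name, and `Value` is again a `String` stand-in for `serde_json::Value` to keep the example free of dependencies.

```rust
use std::fmt;

// Stand-in for serde_json::Value (assumption).
pub type Value = String;

#[derive(Debug, Clone, PartialEq)]
pub enum TryParse<T> {
    Parsed(T),
    Unparsed(Value),
    NotPresent,
}

/// Hypothetical error type that keeps the reason for the failure.
#[derive(Debug, Clone, PartialEq)]
pub enum FieldError {
    Invalid(Value),
    Missing,
}

impl fmt::Display for FieldError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            FieldError::Invalid(v) => write!(f, "unable to parse {:?}", v),
            FieldError::Missing => write!(f, "parameter not present"),
        }
    }
}

impl std::error::Error for FieldError {}

impl<T> TryParse<T> {
    /// Borrowing conversion to `Result`, so `?`, `unwrap` and `expect` work.
    pub fn get(&self) -> Result<&T, FieldError> {
        match self {
            TryParse::Parsed(t) => Ok(t),
            TryParse::Unparsed(v) => Err(FieldError::Invalid(v.clone())),
            TryParse::NotPresent => Err(FieldError::Missing),
        }
    }
}
```

Because `FieldError` implements `std::error::Error`, it can also be propagated with `?` from functions that return `Result<_, Box<dyn std::error::Error>>`.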

Function that generates a HashMap of Enum variants

I'm working with apollo_parser to parse a GraphQL query. It defines an enum, apollo_parser::ast::Definition, that has several variants including apollo_parser::ast::OperationDefinition and apollo_parser::ast::FragmentDefinition. I'd like to have a single trait I can apply to apollo_parser::ast::Definition that provides a function definition_map that returns a HashMap mapping the operation name to the variant instance.
I've got as far as the trait, but I don't know how to implement it. Also, I don't know how to constrain T to be a variant of Definition.
trait Mappable {
fn definition_map<T>(&self) -> HashMap<String, T>;
}
EDIT:
Here's a Rust-ish pseudocode implementation.
impl Mappable for Document {
fn definition_map<T>(&self) -> HashMap<String, T> {
let defs: Vec<T> = self.definitions
.filter_map(|def: Definition| match def {
T(foo) => Some(foo),
_ => None
}).collect();
let mut map = HashMap::new();
for def in defs {
map.insert(def.name(), def);
}
map
}
}
and it would output
// From a document consisting of OperationDefinitions "operation1" and "operation2"
// and FragmentDefinitions "fragment1" and "fragment2"
{
"operation1": OperationDefinition(...),
"operation2": OperationDefinition(...),
}
{
"fragment1": FragmentDefinition(...),
"fragment2": FragmentDefinition(...)
}
I don't know how to constrain T to be a variant of Definition.
There is no such thing in Rust. There's the name of the variant and the name of the type contained within that variant, and there is no relationship between the two. The variants can be named whatever they want, and multiple variants can contain the same type. So there's no shorthand for pulling a T out of an enum which has a variant with a T.
You need to make your own trait that says how to get a T from a Definition:
trait TryFromDefinition {
fn try_from_def(definition: Definition) -> Option<Self> where Self: Sized;
fn name(&self) -> String;
}
And using that, your implementation is simple:
impl Mappable for Document {
fn definition_map<T: TryFromDefinition>(&self) -> HashMap<String, T> {
self.definitions()
.filter_map(T::try_from_def)
.map(|t| (t.name(), t))
.collect()
}
}
You just have to define TryFromDefinition for all the types you want to use:
impl TryFromDefinition for OperationDefinition {
fn try_from_def(definition: Definition) -> Option<Self> {
match definition {
Definition::OperationDefinition(operation) => Some(operation),
_ => None,
}
}
fn name(&self) -> String {
self.name().unwrap().ident_token().unwrap().text().into()
}
}
impl TryFromDefinition for FragmentDefinition {
fn try_from_def(definition: Definition) -> Option<Self> {
match definition {
Definition::FragmentDefinition(operation) => Some(operation),
_ => None,
}
}
fn name(&self) -> String {
self.fragment_name().unwrap().name().unwrap().ident_token().unwrap().text().into()
}
}
...
Some of this could probably be condensed using macros, but there's no normalized way that I can tell to get a name from a definition, so that would still have to be custom per type.
You should also decide how you want to handle definitions that don't have a name; you'd probably want to return Option<String> to avoid all those .unwrap()s, but I don't know how you'd want to put that in your HashMap.
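One possible answer to the nameless-definition question is to have `name` return `Option<String>` and simply leave unnamed definitions out of the map. The sketch below illustrates that choice with stand-in types, since it does not depend on apollo_parser: the `Definition`, `OperationDefinition` and `FragmentDefinition` types here are simplified assumptions, not the crate's real AST nodes, and `definition_map` is written as a free function over a `Vec<Definition>`.

```rust
use std::collections::HashMap;

// Simplified stand-ins for the apollo_parser AST types (assumptions).
#[derive(Debug, Clone, PartialEq)]
pub struct OperationDefinition {
    pub name: Option<String>,
}

#[derive(Debug, Clone, PartialEq)]
pub struct FragmentDefinition {
    pub name: Option<String>,
}

#[derive(Debug, Clone, PartialEq)]
pub enum Definition {
    Operation(OperationDefinition),
    Fragment(FragmentDefinition),
}

pub trait TryFromDefinition: Sized {
    fn try_from_def(definition: Definition) -> Option<Self>;
    // `None` means "this definition has no name".
    fn name(&self) -> Option<String>;
}

impl TryFromDefinition for OperationDefinition {
    fn try_from_def(definition: Definition) -> Option<Self> {
        match definition {
            Definition::Operation(op) => Some(op),
            _ => None,
        }
    }
    fn name(&self) -> Option<String> {
        self.name.clone()
    }
}

impl TryFromDefinition for FragmentDefinition {
    fn try_from_def(definition: Definition) -> Option<Self> {
        match definition {
            Definition::Fragment(frag) => Some(frag),
            _ => None,
        }
    }
    fn name(&self) -> Option<String> {
        self.name.clone()
    }
}

/// Keeps only definitions of type `T` that have a name.
pub fn definition_map<T: TryFromDefinition>(definitions: Vec<Definition>) -> HashMap<String, T> {
    definitions
        .into_iter()
        .filter_map(T::try_from_def)
        .filter_map(|t| t.name().map(|n| (n, t)))
        .collect()
}
```

Silently skipping unnamed definitions is a design choice made for the sketch; if an anonymous operation matters to the caller, a separate field or a `HashMap<Option<String>, T>` key may be more appropriate.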
Without knowing your whole workflow, I might suggest a different route instead:
struct Definitions {
operations: HashMap<String, OperationDefinition>,
fragments: HashMap<String, FragmentDefinition>,
...
}
impl Definitions {
fn from_document(document: &Document) -> Self {
let mut operations = HashMap::new();
let mut fragments = HashMap::new();
...
for definition in document.definitions() {
match definition {
Definition::OperationDefinition(operation) => {
let name: String = operation.name().unwrap().ident_token().unwrap().text().into();
operations.insert(name, operation);
},
Definition::FragmentDefinition(fragment) => {
let name: String = fragment.fragment_name().unwrap().name().unwrap().ident_token().unwrap().text().into();
fragments.insert(name, fragment);
},
...
}
}
Definitions {
operations,
fragments,
...
}
}
}

How can I deserialize a fieldless enum from either a string or number?

Is there a succinct way to deserialize a variant of a fieldless enum from either its name or discriminant value? e.g. given this enum—
enum Foo {
A = 1,
B = 2,
C = 3,
}
—I’d like any of these strings or numbers to represent it:
{
"example-a": "a", // Foo::A
"example-b": "b", // Foo::B
"example-c": "c", // Foo::C
"example-1": 1, // Foo::A
"example-2": 2, // Foo::B
"example-3": 3, // Foo::C
}
I’ve seen that deriving Deserialize accommodates the former group and Deserialize_repr the latter, but I’m not sure how to accommodate both simultaneously.
I expected that a shorthand like #[serde(alias = …)] might exist to cover this scenario.
There is not a built-in shortcut that supports this directly. You will have to implement Deserialize manually. It's neither trivial nor super complicated:
impl<'de> serde::Deserialize<'de> for Foo {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where
D: serde::Deserializer<'de>
{
struct FooVisitor;
impl<'de> serde::de::Visitor<'de> for FooVisitor {
type Value = Foo;
fn expecting(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {
write!(formatter, "an integer or string representing a Foo")
}
fn visit_str<E: serde::de::Error>(self, s: &str) -> Result<Foo, E> {
Ok(match s {
"a" => Foo::A,
"b" => Foo::B,
"c" => Foo::C,
_ => return Err(E::invalid_value(serde::de::Unexpected::Str(s), &self)),
})
}
fn visit_u64<E: serde::de::Error>(self, n: u64) -> Result<Foo, E> {
Ok(match n {
1 => Foo::A,
2 => Foo::B,
3 => Foo::C,
_ => return Err(E::invalid_value(serde::de::Unexpected::Unsigned(n), &self)),
})
}
}
deserializer.deserialize_any(FooVisitor)
}
}
Note that using deserialize_any means we are relying on the data format being self-describing; i.e., that the deserializer knows whether the data is stringy or integerish and will call the correct visit_ method accordingly. Serde also supports non-self-describing formats; however, you won't be able to use them with this Deserialize implementation.
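If the name and discriminant mappings are also useful outside of deserialization, one option is to put them into standard `TryFrom` impls and let `visit_str` and `visit_u64` delegate to them. The following is a dependency-free sketch of just the mapping part; the `String` error type is an assumption chosen for brevity.

```rust
use std::convert::TryFrom;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Foo {
    A = 1,
    B = 2,
    C = 3,
}

// Name -> variant, mirroring the `visit_str` arms.
impl TryFrom<&str> for Foo {
    type Error = String;

    fn try_from(s: &str) -> Result<Self, Self::Error> {
        match s {
            "a" => Ok(Foo::A),
            "b" => Ok(Foo::B),
            "c" => Ok(Foo::C),
            other => Err(format!("unknown Foo name: {:?}", other)),
        }
    }
}

// Discriminant -> variant, mirroring the `visit_u64` arms.
impl TryFrom<u64> for Foo {
    type Error = String;

    fn try_from(n: u64) -> Result<Self, Self::Error> {
        match n {
            1 => Ok(Foo::A),
            2 => Ok(Foo::B),
            3 => Ok(Foo::C),
            other => Err(format!("unknown Foo discriminant: {}", other)),
        }
    }
}
```

Inside the visitor, each method could then be reduced to something like `Foo::try_from(s).map_err(E::custom)`, although that would report the `String` message instead of the `invalid_value` error used above.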

How to map a parametrized enum from a generic type to another?

If I have a type like MyEnum<T>, how can I map it in cases where not every variant is parameterized?
For example, I'd like to convert from MyEnum<u32> to MyEnum<String>:
enum MyEnum<T> {
B,
C,
D(T),
}
fn trans(a: MyEnum<u32>) -> MyEnum<String> {
match a {
MyEnum::D(i) => MyEnum::D(i.to_string()),
other_cases => other_cases,
}
}
fn main() {}
This fails with:
error[E0308]: match arms have incompatible types
--> src/main.rs:8:9
|
8 | match a {
| ^ expected struct `std::string::String`, found u32
|
= note: expected type `MyEnum<std::string::String>`
= note: found type `MyEnum<u32>`
note: match arm with an incompatible type
--> src/main.rs:10:28
|
10 | other_cases => other_cases,
| ^^^^^^^^^^^
Instead of the other_cases => other_cases line, I tried this, also without success:
other_cases => {
let o: MyEnum<String> = other_cases;
o
}
I'd create a map method on your enum:
#[derive(Debug)]
enum MyEnum<T> {
B,
C,
D(T),
}
impl<T> MyEnum<T> {
fn map<U>(self, f: impl FnOnce(T) -> U) -> MyEnum<U> {
use MyEnum::*;
match self {
B => B,
C => C,
D(x) => D(f(x)),
}
}
}
fn main() {
let answer = MyEnum::D(42);
let answer2 = answer.map(|x| x.to_string());
println!("{:?}", answer2);
}
This is similar to existing map methods, such as Option::map.
Well, this is actually an answer:
enum MyEnum<T> {
B,
C,
D(T),
}
fn trans(a: MyEnum<u32>) -> MyEnum<String> {
match a {
MyEnum::D(i) => MyEnum::D(i.to_string()),
MyEnum::B => MyEnum::B,
MyEnum::C => MyEnum::C
}
}
fn main() {
}
But repeating all the variants isn't acceptable when there are a lot of them.
Some languages (like C++) use duck typing: if it quacks like a duck, it must be a duck, and therefore names matter. Rust does not.
In Rust, names are just some display utility for us mere humans, the B in MyEnum<u32> and MyEnum<String> may happen to have the same visual representation, but they are completely different syntactic entities as far as the language is concerned.
There are multiple ways to alleviate your pain, though:
a code generation plugin or build.rs script can be used as well
a macro can be used to automate the mapping
a manual mapping can be done, it's a one shot effort after all
the code can be restructured to separate type-dependent from type-independent variants
I'll show-case the latter:
enum MyEnumImpl {
A,
B,
C,
}
enum MyEnum<T> {
Independent(MyEnumImpl),
Dependent(T),
}
Obviously, the latter makes it much easier to manually map things.
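To illustrate why the restructured form is easier to map, here is a minimal sketch that adds a `map` method to the two-variant `MyEnum<T>` shown above. The `map` and `trans` names follow the other answers; the derives are added only so that the values can be compared and printed.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MyEnumImpl {
    A,
    B,
    C,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum MyEnum<T> {
    Independent(MyEnumImpl),
    Dependent(T),
}

impl<T> MyEnum<T> {
    // Only two arms are needed, no matter how many variants MyEnumImpl has.
    pub fn map<U>(self, f: impl FnOnce(T) -> U) -> MyEnum<U> {
        match self {
            MyEnum::Independent(tag) => MyEnum::Independent(tag),
            MyEnum::Dependent(x) => MyEnum::Dependent(f(x)),
        }
    }
}

pub fn trans(a: MyEnum<u32>) -> MyEnum<String> {
    a.map(|i| i.to_string())
}
```

Adding a new type-independent variant then only touches `MyEnumImpl`, and the mapping code stays unchanged.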
macro_rules! partial_enum {
($name: ident, $some: ident, $($none: ident),+) => {
#[derive(Debug)]
enum $name<T> {
$some(T),
$($none),+
}
impl<T> $name<T> {
fn convert<U>(self) -> Result<$name<U>, T> {
match self {
$name::$some(x) => Err(x),
$($name::$none => Ok($name::$none)),+
}
}
}
}
}
partial_enum!(MyEnum, D, B, C);
fn trans(a: MyEnum<u32>) -> MyEnum<String> {
let a_split: Result<MyEnum<String>, u32> = a.convert();
match a_split {
Ok(is_none) => is_none,
Err(not_none) => MyEnum::D(not_none.to_string()),
}
}
fn main() {
println!("{:?}", trans(MyEnum::D(13)));
}

Is there an easy way to cast entire tuples of scalar values at once?

I want to cast a (u16, u16) to a (f32, f32). This is what I tried:
let tuple1 = (5u16, 8u16);
let tuple2 = tuple1 as (f32, f32);
Ideally, I would like to avoid writing
let tuple2 = (tuple1.0 as f32, tuple1.1 as f32);
There's no built-in way to do this, but one can do it with a macro:
macro_rules! tuple_as {
($t: expr, ($($ty: ident),*)) => {
{
let ($($ty,)*) = $t;
($($ty as $ty,)*)
}
}
}
fn main() {
let t: (u8, char, isize) = (97, 'a', -1);
let other = tuple_as!(t, (char, i32, i8));
println!("{:?}", other);
}
Prints ('a', 97, -1).
The macro only works for casting between types with names that are a single identifier (that's what the : ident refers to), since it reuses those names for binding to the elements of the source tuple to be able to cast them. All primitive types are valid single identifiers, so it works well for those.
No, you cannot. This is roughly equivalent to "can I cast all the fields in a struct to different types all at once?".
You can write a generic extension trait which can do this conversion for you, the only problem is that I don't believe there's any existing generic "conversion" trait which also has a u16 -> f32 implementation defined.
If you really want a function that does this, here is an as-minimal-as-I-could-make-it skeleton you can build on:
trait TupleCast<T> {
type Output;
fn tuple_cast(self) -> <Self as TupleCast<T>>::Output;
}
impl<T> TupleCast<T> for () {
type Output = ();
fn tuple_cast(self) -> <() as TupleCast<T>>::Output {
()
}
}
impl<S, T> TupleCast<T> for (S,) where S: CustomAs<T> {
type Output = (T,);
fn tuple_cast(self) -> <(S,) as TupleCast<T>>::Output {
(self.0.custom_as(),)
}
}
impl<S, T> TupleCast<T> for (S, S) where S: CustomAs<T> {
type Output = (T, T);
fn tuple_cast(self) -> <(S, S) as TupleCast<T>>::Output {
(self.0.custom_as(), self.1.custom_as())
}
}
// You would probably have more impls, up to some size limit.
// We can't use std::convert::From, because it isn't defined for the same
// basic types as the `as` operator is... which kinda sucks. So, we have
// to implement the desired conversions ourselves.
//
// Since this would be hideously tedious, we can use a macro to speed things
// up a little.
trait CustomAs<T> {
fn custom_as(self) -> T;
}
macro_rules! custom_as_impl {
($src:ty:) => {};
($src:ty: $dst:ty) => {
impl CustomAs<$dst> for $src {
fn custom_as(self) -> $dst {
self as $dst
}
}
};
($src:ty: $dst:ty, $($rest:ty),*) => {
custom_as_impl! { $src: $dst }
custom_as_impl! { $src: $($rest),* }
};
}
// You could obviously list others, or do manual impls.
custom_as_impl! { u16: u16, u32, u64, i32, i64, f32, f64 }
fn main() {
let x: (u16, u16) = (1, 2);
let y: (f32, f32) = x.tuple_cast();
println!("{:?}", y);
}
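For the specific u16 to f32 case in the question, current versions of the standard library do implement `From<u16> for f32`, because that conversion is lossless. Under that assumption, a much smaller sketch is possible for pairs whose conversion is covered by `From`; `cast_pair` is a hypothetical name, and this does not replace `as` for lossy casts such as f32 to u16.

```rust
/// Converts both elements of a pair using the lossless `From` conversions.
pub fn cast_pair<S, T: From<S>>(pair: (S, S)) -> (T, T) {
    let (a, b) = pair;
    (T::from(a), T::from(b))
}
```

With this, `let tuple2: (f32, f32) = cast_pair((5u16, 8u16));` compiles, while a lossy target such as `(u8, u8)` is rejected at compile time because no `From<u16> for u8` impl exists.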
No, there is not.
This version handles a few more cases.
Original source: https://stackoverflow.com/a/29981602/5979634
Because of the matching rules, for single-type casts of a pair just use tuple_as!(expr, T) or tuple_as!(expr, (T)).
The rest works as in the original answer.
macro_rules! tuple_as {
($t: expr, $ty: ident) => {{
let (a, b) = $t;
let a = a as $ty;
let b = b as $ty;
(a, b)
}};
($t: expr, ($ty: ident)) => {{
let (a, b) = $t;
let a = a as $ty;
let b = b as $ty;
(a, b)
}};
($t: expr, ($($ty: ident),*)) => {{
let ($($ty,)*) = $t;
($($ty as $ty,)*)
}}
}
