Set of structs in Go

If I have a number of structs that I want to store:
```go
type Stuff struct {
	a string
	b string
}
```
I can do it with a slice, but it seems like it would use less memory to use a proper set structure.
Unfortunately Go doesn't have a set structure. Everyone recommends using map[Stuff]struct{} but that doesn't work because Stuff is a struct. Anyone have any good solutions? Ideally without having to download a library.

Set and map data structures usually require more memory than storing the same values in a plain array or slice, because sets and maps efficiently provide additional features such as uniqueness or value retrieval by key.
If you want minimal memory usage, simply store the values in a slice, e.g. []Stuff. If you use the values in multiple places, it might also be profitable to store pointers, e.g. []*Stuff, so that each place referring to the same Stuff value can share the same pointer (without duplicating the value).
If you only want to store unique struct values, then a set is indeed the most convenient choice, realized in Go with a map.
There's nothing wrong with map[Stuff]struct{}; it works. The spec's requirement for map key types:

> The comparison operators == and != must be fully defined for operands of the key type; thus the key type must not be a function, map, or slice.

Stuff is a struct, and structs in Go are comparable if:

> Struct values are comparable if all their fields are comparable. Two struct values are equal if their corresponding non-blank fields are equal.
If your Stuff struct is what you posted, it is comparable: it only contains fields of the comparable type string.
Also note that if you want a set data structure, it's clearer to use bool as the value type (e.g. map[Stuff]bool) with true as the value. Then you can simply use indexing to test membership: the index expression yields the zero value of the value type (false for bool) if the key is not in the map, correctly reporting that the value is not in the "set"; and if the key is in the map, its associated true value is the result, correctly reporting that it is.

As icza said, a map of structs is a valid option.
But instead of implementing the set yourself, I would use one of the generic set implementations available since Go 1.18.
See this one for example: https://github.com/amit7itz/goset
```go
package main

import (
	"fmt"

	"github.com/amit7itz/goset"
)

type Stuff struct {
	a string
	b string
}

func main() {
	set := goset.NewSet[Stuff]()
	set.Add(Stuff{a: "1", b: "2"})
	set.Add(Stuff{a: "2", b: "3"})
	set.Add(Stuff{a: "2", b: "3"}) // duplicate, ignored
	fmt.Println(set)       // e.g. Set[main.Stuff]{{2 3} {1 2}} (iteration order is not guaranteed)
	fmt.Println(set.Len()) // 2
}
```

Related

How to define an ordered Map/Set with a runtime-defined comparator?

This is similar to How do I use a custom comparator function with BTreeSet?, but in my case I won't know the sorting criteria until runtime. The possible criteria are extensive and can't be hard-coded (think something like sort by distance to target, or sort by specific bytes in a payload, or a combination thereof). The sorting criteria won't change after the map/set is created.
The only alternatives I see are:

- use a Vec, but log(n) inserts and deletes are crucial
- wrap each of the elements with the sorting criteria (directly or indirectly), but that seems wasteful

This is possible with the standard C++ containers std::map/std::set, but it doesn't seem possible with Rust's BTreeMap/BTreeSet. Is there an alternative in the standard library or in another crate that can do this? Or will I have to implement this myself?
My use-case is a database-like system where elements in the set are defined by a schema, like:
```
Element {
    FIELD x: f32
    FIELD y: f32
    FIELD z: i64
    ORDERBY z
}
```
But since the schema is user-defined at runtime, the elements are stored in a set of bytes (BTreeSet<Vec<u8>>). Likewise the order of the elements is user-defined. So the comparator I would give to BTreeSet would look like |a, b| schema.cmp(a, b). Hard-coded, the above example may look something like:
```rust
fn cmp(&self, a: &Vec<u8>, b: &Vec<u8>) -> Ordering {
    let a_field = self.get_field(a, 2).as_i64();
    let b_field = self.get_field(b, 2).as_i64();
    a_field.cmp(&b_field)
}
```
Would it be possible to pass the comparator closure as an argument to each node operation that needs it? It would be owned by the tree wrapper instead of cloned in every node.

How do I create a constant enum variant with heap-allocated data to use in pattern matching?

I have this enum:
```rust
enum Foo {
    Bar,
    Baz(String),
    // And maybe some other variants
}
```
I find myself constantly pattern matching against predefined strings of Baz.
```rust
match foo_instance {
    Foo::Baz(ref x) => match x.as_str() {
        "stuff" => ...,
        "anotherstuff" => ...,
        _ => ...,
    },
    // Other match cases...
}
```
Sometimes I only match one case, sometimes 4-5 cases of Foo::Baz. In the latter situation the double match doesn't bother me so much; in fact the grouping makes sense at that point. But if I only match against one case of Foo::Baz, it doesn't feel right. What I really want to be able to do is this:
```rust
const STUFF: Foo = Foo::Baz("stuff".to_string());
const ANOTHER_STUFF: Foo = Foo::Baz("anotherstuff".to_string());

match foo_instance {
    &STUFF => ...,
    &ANOTHER_STUFF => ...,
    // Other match cases...
}
```
But of course, because of the to_string() calls, this won't work (and I would also need to derive the Eq trait to be able to match against consts, which is odd and may also be a problem for me). Is there any way to mimic this? For example, using a macro, can I do something like this:
```rust
const STUFF: Foo = magic!(Foo::Baz("stuff".to_string()));
const ANOTHER_STUFF: Foo = magic!(Foo::Baz("anotherstuff".to_string()));

// Or any other thing that can mimic the behavior above;
// creating these constants dynamically at program start would work too.

match foo_instance {
    another_magic!(STUFF) => ...,
    another_magic!(ANOTHER_STUFF) => ...,
    // Other match cases...
}
```
Generally speaking, I want to have some constant variants of an enum that contains heap-allocated data (String in this case), so that I can reuse them whenever I need them in a match case. What's the best way to deal with this?
You cannot create constants with heap-allocated data in current Rust, full stop.
Maybe in the future it will be possible to create a const FOO: String, but there's a lot of work to be done and decisions to be made before that.
> using a macro
Macros are not magic. They only let you write certain types of code that you can already write but with a new syntax. Since you cannot create these constants, you cannot write a macro to create them either.
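What you can do instead, as the questions linked below describe, is keep the strings themselves as &str constants and compare inside the match. A minimal sketch (the classify function and its return codes are made up for illustration):

```rust
enum Foo {
    Bar,
    Baz(String),
}

// &str constants are allowed, unlike heap-allocated String constants.
const STUFF: &str = "stuff";
const ANOTHER_STUFF: &str = "anotherstuff";

fn classify(foo: &Foo) -> u8 {
    match foo {
        Foo::Baz(s) if s.as_str() == STUFF => 1,
        Foo::Baz(s) if s.as_str() == ANOTHER_STUFF => 2,
        Foo::Baz(_) => 3,
        Foo::Bar => 0,
    }
}

fn main() {
    assert_eq!(classify(&Foo::Baz("stuff".to_string())), 1);
    assert_eq!(classify(&Foo::Baz("anotherstuff".to_string())), 2);
    assert_eq!(classify(&Foo::Baz("other".to_string())), 3);
    assert_eq!(classify(&Foo::Bar), 0);
}
```

This keeps a single match over Foo, at the cost of match guards rather than plain patterns.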
I want to somehow extract the whole left-hand side of the match case into a constant or something like that. Match guards don't help me with that (yes, they eliminate the need for a double match, but they're a last resort for me).
Also, map()ing my enum is not going to work, because it's not a struct; I'd need to match inside the map too.
Creating a parallel type seems too weird.
I could create a const static &str and match against that with those solutions, but that raises another problem: the constants would only hold the strings and lose the overall meaning.
There is only one solution among those presented that does what I need, and I found it too verbose.
I have a lot of enums like this, and it would be hard to create parallel types for each of them; apart from that, the solution is just verbose.
For other readers who can use the alternative solutions which you have already discarded, see also:
How do I match a String in a struct with a constant value?
For further reading, see also:
Trying to declare a String const results in expected type, found "my string"
Expected String, found &str when matching an optional string
How to match a String against string literals in Rust?

How can I create specific values for generic numeric types?

I'm working on a library which will provide a trait for axis-aligned bounding boxes (AABB) operations. The trait is declared like this:
```rust
trait Aabb {
    type Precision: Zero + One + Num + PartialOrd + Copy;
    // [...]
}
```
I don't care which precision the user chooses, as long as these constraints are respected (though I don't really expect integer types to be chosen).
I'm having trouble using literals. Some operations require constant values, as an example:
```rust
let extension = 0.1;
aabb.extend(extension);
```
This doesn't work because Aabb::extend expects Aabb::Precision and not a float. My solution was something like this:
```rust
let mut ten = Aabb::Precision::zero();
for _ in 0..10 {
    ten = ten + Aabb::Precision::one();
}
let aabb_extension = Aabb::Precision::one() / ten;
```
This works, but I need to resort to this every time I need a specific number and it is getting cumbersome. Is this really the only way?
> I need to resort to this every time I need a specific number and it is getting cumbersome. Is this really the only way?
Basically, yes. Unless you can answer the question of "how do you support converting a literal 0 to MyCustomTypeThatImplementsTheTrait".
You can't have it both ways: you can't ask for something to be generic and then use concrete literals.
You can have different workarounds. Providing base values like "zero" and "one", or having a "convert a specific type to yourself" method, for example.
You could also re-evaluate what you are attempting to do; perhaps you are thinking at too low a level. Indeed, what does it mean to "extend by 0.1" a type that represents points as floating point values between 0 and 1?
Maybe it would be better to have an expand_by_percentage method instead, or something else that makes sense in the domain.
See also:
How do I use integer number literals when using generic types?
Cannot create a generic function that uses a literal zero
Dividing a const by a generic in Rust
How can I create an is_prime function that is generic over various integer types?
In this case, I would recommend that you create your own trait and provide default implementations of the methods.
For example, I would naively imagine:
```rust
trait ApproximateValue: Zero + One {
    fn approximate(val: f64) -> Self {
        // some algorithm to create "val" from Zero and One
    }
}
```
then, your Precision associated type will have a bound of ApproximateValue and you will just call Precision::approximate(0.1).

Set of structs containing a slice

I'm trying to implement a toy search algorithm and need to maintain a set of explored states. A state is a struct:
```go
type VWState struct {
	botLocation   VWCoords
	dirtLocations []VWCoords
}
```
My first thought was that a simple Set could be implemented using a map[VWState]bool, but I can't seem to figure out a way to make it work. If I try to use a VWState as a key to a map, I get the following panic:
```
panic: runtime error: hash of unhashable type vw.VWState (PC=0x40EB0D)
```
Is there a way to make this work? Can I implement a custom hashing function for the struct, or should I be looking at some other ways to implement this?
Any help would be greatly appreciated.
You can use a pointer to your struct as a map key:
map[*VWState]bool
If you want to be able to compare equivalent structs, you can create a method to output a key for the map. String() would be convenient, since you could also use it to print your struct, or tie in a hash function and output something shorter, even an int.
Something as simple as this may suffice, though you could make the output shorter if you like (being careful not to recursively call String() in your format line):
```go
func (s VWState) String() string {
	return fmt.Sprintf("%#v", s)
}

func main() {
	m := make(map[string]bool)
	s := VWState{}
	m[s.String()] = true
}
```
If there is a sensible maximum length for dirtLocations then you could use an array instead of a slice. Arrays are hashable (provided the element is hashable).
```go
type VWState struct {
	botLocation   VWCoords
	dirtLocations [4]VWCoords
}
```
You'll then need to either add a count of the number of valid dirtLocations or detect the zero value of VWCoords to work out how many slots in dirtLocations are valid.

Is Option<T> optimized to a single byte when T allows it?

Suppose we have an enum Foo { A, B, C }.
Is an Option<Foo> optimized to a single byte in this case?
Bonus question: if so, what are the limits of the optimization process? Enums can be nested and contain other types. Is the compiler always capable of calculating the maximum number of combinations and then choosing the smallest representation?
The compiler is not very smart when it comes to optimizing the layout of enums for space. Given:
```rust
enum Option<T> { None, Some(T) }
enum Weird<T> { Nil, NotNil { x: int, y: T } }
enum Foo { A, B, C }
```
There's really only one case the compiler considers:
An Option-like enum: one variant carrying no data ("nullary"), one variant containing exactly one datum. When used with a pointer known to never be null (currently, only references and Box<T>) the representation will be that of a single pointer, null indicating the nullary variant. As a special case, Weird will receive the same treatment, but the value of the y field will be used to determine which variant the value represents.
Beyond this, there are many, many possible optimizations available, but the compiler doesn't do them yet. In particular, your case will not be represented as a single byte. A single enum on its own (not considering the nested case) will be represented as the smallest integer type that can hold all its discriminants.
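The pointer case described above is easy to verify with std::mem::size_of; a minimal check:

```rust
use std::mem::size_of;

fn main() {
    // The null-pointer case: Option<&T> and Option<Box<T>> are the same
    // size as the bare pointer, with null encoding None.
    assert_eq!(size_of::<Option<&u8>>(), size_of::<&u8>());
    assert_eq!(size_of::<Option<Box<u8>>>(), size_of::<Box<u8>>());
    println!("ok");
}
```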
