Multiple parameter lists in Rust

I've just started learning Rust, and I wonder how best to translate the pattern of multiple parameter lists.
In Scala, I can define functions taking multiple parameter lists as follows:
def add(n1: Int)(n2: Int) = n1 + n2
This can be used, for example, for function specialisation:
val incrementer = add(1)
val three = incrementer(2)
val four = incrementer(three)
One of my favourite uses of this feature is incrementally constructing immutable data structures. For example, where the initial caller might not know all of the required fields, so they can fill some of them, get back a closure taking the rest of the fields, and then pass that along for someone else to fill in the rest, completing construction.
I tried to implement this partial construction pattern in Rust:
#[derive(Debug)]
#[non_exhaustive]
pub struct Name<'a> {
    pub first: &'a str,
    pub last: &'a str,
}

impl<'a> Name<'a> {
    pub fn new(first: &'a str, last: &'a str) -> Result<Self, &'static str> {
        if first.len() > 0 && last.len() > 0 {
            return Ok(Self { first, last });
        }
        return Err("first and last must not be empty");
    }

    pub fn first(first: &'a str) -> impl Fn(&'a str) -> Result<Name<'a>, &'static str> {
        move |last| Name::new(first, last)
    }
}
But it's extremely verbose, and it seems like there should be a much easier way (imagine there were 5 fields, I'd have to write 5 functions).
In essence I would like something like this (pseudo-Rust):
pub fn first(first: &'a str)(last: &'a str) -> Result<Name, &'static str> {
    Name::new(first, last)
}

let takes_last = first("John");
let name = takes_last("Smith").unwrap();
What is the best way to have this pattern in Rust?

As Chayim said in a comment, you did the currying part the best way possible. In Rust, though, you'd usually just define the function taking all parameters; if you want to partially apply it, use a closure at the call site:
fn add(a: i32, b: i32) -> i32 {
    a + b
}

let incrementer = |b| add(1, b);
let three = incrementer(2);
let four = incrementer(three);
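If you want the curried shape itself as a reusable item rather than an ad-hoc closure, a function can also return a closure. A small sketch (this `add` mirrors the toy function above; it's not a library API):

```rust
// Currying by returning a closure: fix the first argument,
// get back a function of the second.
fn add(a: i32) -> impl Fn(i32) -> i32 {
    // `move` transfers ownership of `a` into the closure,
    // so the closure can outlive this function call.
    move |b| a + b
}

fn main() {
    let incrementer = add(1);
    let three = incrementer(2);
    let four = incrementer(three);
    assert_eq!((three, four), (3, 4));
}
```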

One of my favourite uses of this feature is incrementally constructing immutable data structures.
Very well. Rust's ownership rules and type system can be quite awesome for this. Consider your toy example; here's one way to implement it:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=5f823b81c3b2ca2d01e1ffb0d23aff72
struct NameBuilder {
    first: Option<String>,
    last: Option<String>,
}

struct Name {
    first: String,
    last: String,
}

impl NameBuilder {
    fn new() -> Self {
        NameBuilder {
            first: None,
            last: None,
        }
    }

    fn with_first(mut self, first: String) -> Result<Self, &'static str> {
        if first.is_empty() {
            Err("First name cannot be empty")
        } else {
            self.first = Some(first);
            Ok(self)
        }
    }

    fn with_last(mut self, last: String) -> Result<Self, &'static str> {
        if last.is_empty() {
            Err("Last name cannot be empty")
        } else {
            self.last = Some(last);
            Ok(self)
        }
    }

    fn to_name(self) -> Result<Name, &'static str> {
        Ok(Name {
            first: self.first.ok_or("Must provide a first name!")?,
            last: self.last.ok_or("Must provide a last name!")?,
        })
    }
}

fn main() -> Result<(), &'static str> {
    let name_builder = NameBuilder::new();
    assert!(name_builder.to_name().is_err());

    let name_builder = NameBuilder::new()
        .with_first("Homer".to_string())?
        .with_last("Simpson".to_string())?;
    let name = name_builder.to_name()?;
    assert_eq!(name.first, "Homer");
    assert_eq!(name.last, "Simpson");
    Ok(())
}
This is obviously overkill for your example, but it can work really nicely in situations with lots of parameters where any given concrete use case only explicitly sets a few of them and uses default values for the rest.
An added benefit is that you can freely choose the order in which you build it.
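To make that order-independence concrete, here is a trimmed-down builder with the same shape (the names like `PointBuilder` are hypothetical, for illustration only):

```rust
#[derive(Default)]
struct PointBuilder {
    x: Option<i32>,
    y: Option<i32>,
}

impl PointBuilder {
    // Each setter consumes the builder and hands it back,
    // just like with_first/with_last above.
    fn x(mut self, x: i32) -> Self {
        self.x = Some(x);
        self
    }
    fn y(mut self, y: i32) -> Self {
        self.y = Some(y);
        self
    }
    fn build(self) -> Result<(i32, i32), &'static str> {
        Ok((self.x.ok_or("x not set")?, self.y.ok_or("y not set")?))
    }
}

fn main() {
    // Fill the fields in either order; the result is the same.
    let a = PointBuilder::default().x(1).y(2).build().unwrap();
    let b = PointBuilder::default().y(2).x(1).build().unwrap();
    assert_eq!(a, b);
}
```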
In my example I opted for String rather than &'a str, mostly so I don't have to type so many awkward &'s :p
NOTE: Even though the with_first method takes in mut self as argument, we still are dealing with an essentially immutable data structure, because we're just consuming self (i.e. taking ownership). Basically, there's no way that someone would hold a reference and then be surprised by us setting the first name to something else, because you can't move self while someone is still borrowing it.
This is of course not the only way to make a fluent interface. You could also think of a purely functional approach where data is immutable and you don't consume self. Then we're entering "persistent data structures" territory, e.g. https://github.com/orium/rpds

Related

Assembling a string and returning it with lifetime parameters for a l-system

I'm trying to implement an L-system struct and am struggling with it. I've already tried different approaches, but my main struggle comes from the lifetimes of references. What I'm trying to achieve is passing the value of the applied axioms back to my system variable, which I passed with the necessary lifetime into apply_axioms_once.
use std::collections::HashMap;

struct LSytem<'a> {
    axioms: HashMap<&'a char, &'a str>,
}

impl<'a> LSytem<'a> {
    fn apply_axioms_once(&mut self, system: &'a mut str) -> &'a str {
        let mut applied: String = String::new();
        for c in system.chars() {
            let axiom = self.axioms.get(&c).unwrap();
            for s in axiom.chars() {
                applied.push(s);
            }
        }
        system = applied.as_str();
        system
    }

    fn apply_axioms(&mut self, system: &'a str, iterations: u8) -> &'a str {
        let mut applied: &str = system;
        // check for 0?
        for _ in 0..iterations {
            applied = self.apply_axioms_once(applied);
        }
        &applied
    }
}
I already read a couple of similar questions, but still can't quite wrap my head around it. What seems to be the most on-point answer is https://stackoverflow.com/a/42506211/18422275, but I'm still puzzled about how to apply it to my issue. I am still a beginner in Rust, and much greener than I thought.
This can't work because you return a reference to data created inside the function: that data only lives until the end of the function's scope, so the returned reference would be dangling. You should return an owned String from your functions instead.
I made this example to try out:
use std::collections::HashMap;

struct LSytem<'a> {
    axioms: HashMap<&'a char, &'a str>,
}

impl<'a> LSytem<'a> {
    fn apply_axioms_once(&mut self, system: &str) -> String {
        let mut applied: String = String::new();
        for c in system.chars() {
            let axiom = self.axioms.get(&c).unwrap();
            for s in axiom.chars() {
                applied.push(s);
            }
        }
        applied
    }

    fn apply_axioms(&mut self, system: &str, iterations: u8) -> String {
        let mut applied = String::from(system);
        // check for 0?
        for _ in 0..iterations {
            // feed the previous iteration's output back in,
            // not the original `system`
            applied = self.apply_axioms_once(&applied);
        }
        applied
    }
}

fn main() {
    let mut ls = LSytem { axioms: HashMap::new() };
    ls.axioms.insert(&'a', "abc");
    let s = String::from("a");
    ls.apply_axioms(&s, 1);
}
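To see the fixed loop in action, here is a compact self-contained variant (the key type is simplified to `char` and the names are mine, not the asker's), applying the classic algae rules a → ab, b → a:

```rust
use std::collections::HashMap;

struct LSystem<'a> {
    axioms: HashMap<char, &'a str>,
}

impl<'a> LSystem<'a> {
    // Replace every character by its expansion, building an owned String.
    fn apply_axioms_once(&self, system: &str) -> String {
        system
            .chars()
            .map(|c| *self.axioms.get(&c).unwrap())
            .collect()
    }

    // Feed each iteration's output back into the next one.
    fn apply_axioms(&self, system: &str, iterations: u8) -> String {
        let mut applied = String::from(system);
        for _ in 0..iterations {
            applied = self.apply_axioms_once(&applied);
        }
        applied
    }
}

fn main() {
    let mut axioms = HashMap::new();
    axioms.insert('a', "ab");
    axioms.insert('b', "a");
    let ls = LSystem { axioms };
    assert_eq!(ls.apply_axioms("a", 1), "ab");
    assert_eq!(ls.apply_axioms("a", 2), "aba");
    assert_eq!(ls.apply_axioms("a", 3), "abaab");
}
```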

How do I create mutable iterator over struct fields

So I am working on a little NES emulator using Rust, and I am trying to be fancy with my status register. The register is a struct holding some fields (flags) that each contain a bool; the register itself is part of a CPU struct. Now, I want to loop through these fields and set the bool values based on some instruction I execute. However, I am not able to implement a mutable iterator. I've implemented an into_iter() function and am able to iterate through the fields to get/print a bool value, but how do I mutate these values within the struct itself? Is this even possible?
pub struct StatusRegister {
    CarryFlag: bool,
    ZeroFlag: bool,
    OverflowFlag: bool,
}

impl StatusRegister {
    fn new() -> Self {
        StatusRegister {
            CarryFlag: true,
            ZeroFlag: false,
            OverflowFlag: true,
        }
    }
}

impl<'a> IntoIterator for &'a StatusRegister {
    type Item = bool;
    type IntoIter = StatusRegisterIterator<'a>;

    fn into_iter(self) -> Self::IntoIter {
        StatusRegisterIterator {
            status: self,
            index: 0,
        }
    }
}

pub struct StatusRegisterIterator<'a> {
    status: &'a StatusRegister,
    index: usize,
}

impl<'a> Iterator for StatusRegisterIterator<'a> {
    type Item = bool;

    fn next(&mut self) -> Option<bool> {
        let result = match self.index {
            0 => self.status.CarryFlag,
            1 => self.status.ZeroFlag,
            2 => self.status.OverflowFlag,
            _ => return None,
        };
        self.index += 1;
        Some(result)
    }
}

pub struct CPU {
    pub memory: [u8; 0xffff],
    pub status: StatusRegister,
}

impl CPU {
    pub fn new() -> CPU {
        let memory = [0; 0xFFFF];
        CPU {
            memory,
            status: StatusRegister::new(),
        }
    }

    fn execute(&mut self) {
        let mut shifter = 0b1000_0000;
        for status in self.status.into_iter() {
            //mutate status here!
            println!("{}", status);
            shifter <<= 1;
        }
    }
}

fn main() {
    let mut cpu = CPU::new();
    cpu.execute();
}
Implementing an iterator over mutable references is hard in general. It becomes unsound if the iterator ever returns references to the same element twice. That means that if you want to write one in purely safe code, you have to somehow convince the compiler that each element is only visited once. That rules out simply using an index: you could always forget to increment the index or set it somewhere and the compiler wouldn't be able to reason about it.
One possible way around is chaining together several std::iter::onces (one for each reference you want to iterate over).
For example,
impl StatusRegister {
    fn iter_mut(&mut self) -> impl Iterator<Item = &mut bool> {
        use std::iter::once;
        once(&mut self.CarryFlag)
            .chain(once(&mut self.ZeroFlag))
            .chain(once(&mut self.OverflowFlag))
    }
}
(playground)
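For completeness, here is the same once/chain idea as a runnable whole, with a caller flipping every flag through the iterator (I've shortened the field names to idiomatic snake_case; the flag set is the same):

```rust
struct StatusRegister {
    carry: bool,
    zero: bool,
    overflow: bool,
}

impl StatusRegister {
    // Chain one std::iter::once per field; the compiler can see
    // that the three mutable borrows are disjoint, so no unsafe is needed.
    fn iter_mut(&mut self) -> impl Iterator<Item = &mut bool> {
        use std::iter::once;
        once(&mut self.carry)
            .chain(once(&mut self.zero))
            .chain(once(&mut self.overflow))
    }
}

fn main() {
    let mut reg = StatusRegister { carry: true, zero: false, overflow: true };
    for flag in reg.iter_mut() {
        *flag = !*flag; // mutate through the iterator
    }
    assert!(!reg.carry && reg.zero && !reg.overflow);
}
```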
Upsides:
Fairly simple to implement.
No allocations.
No external dependencies.
Downsides:
The iterator has a very complicated type: std::iter::Chain<std::iter::Chain<std::iter::Once<&mut bool>, std::iter::Once<&mut bool>>, std::iter::Once<&mut bool>>.
So if you don't want to use impl Iterator<Item = &mut bool>, you'll have to have that type in your code. That includes implementing IntoIterator for &mut StatusRegister, since you'd have to explicitly name the IntoIter type.
Another approach is using an array or Vec to hold all the mutable references (with the correct lifetime) and then delegate to its iterator implementation to get the values. For example,
impl StatusRegister {
    fn iter_mut(&mut self) -> std::vec::IntoIter<&mut bool> {
        vec![
            &mut self.CarryFlag,
            &mut self.ZeroFlag,
            &mut self.OverflowFlag,
        ]
        .into_iter()
    }
}
(playground)
Upsides:
The type is the much more manageable std::vec::IntoIter<&mut bool>.
Still fairly simple to implement.
No external dependencies.
Downsides:
Requires an allocation every time iter_mut is called.
I also mentioned using an array. That would avoid the allocation, but it turns out that arrays don't yet implement an iterator over their values, so the above code with a [&mut bool; 3] instead of a Vec<&mut bool> won't work. However, there exist crates that implement this functionality for fixed-length arrays with limited size, e.g. arrayvec (or array_vec).
Upsides:
No allocation.
Simple iterator type.
Simple to implement.
Downsides:
External dependency.
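An update to the array point above: since Rust 1.53 (and, with method-call syntax, edition 2021) arrays do implement by-value IntoIterator, so the allocation-free array variant now works with no external crate. A sketch under that assumption, again with hypothetical snake_case fields:

```rust
struct StatusRegister {
    carry: bool,
    zero: bool,
    overflow: bool,
}

impl StatusRegister {
    // [T; N] is IntoIterator by value on modern Rust,
    // so this is allocation-free and crate-free.
    fn iter_mut(&mut self) -> std::array::IntoIter<&mut bool, 3> {
        [&mut self.carry, &mut self.zero, &mut self.overflow].into_iter()
    }
}

fn main() {
    let mut reg = StatusRegister { carry: false, zero: false, overflow: false };
    for flag in reg.iter_mut() {
        *flag = true;
    }
    assert!(reg.carry && reg.zero && reg.overflow);
}
```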
The last approach I'll talk about is using unsafe. Since this doesn't have many good upsides over the other approaches, I wouldn't recommend it in general. This is mainly to show you how you could implement this.
Like your original code, we'll implement Iterator on our own struct.
impl<'a> IntoIterator for &'a mut StatusRegister {
    type IntoIter = StatusRegisterIterMut<'a>;
    type Item = &'a mut bool;

    fn into_iter(self) -> Self::IntoIter {
        StatusRegisterIterMut {
            status: self,
            index: 0,
        }
    }
}

pub struct StatusRegisterIterMut<'a> {
    status: &'a mut StatusRegister,
    index: usize,
}
The unsafety comes from the next method, where we essentially have to extend the lifetime of a reference borrowed from &mut self to the longer lifetime 'a, which is generally unsafe. However, as long as we ensure that next never hands out two mutable references to the same field, we should be fine. There may be some other subtle issues, so I won't guarantee that this is sound. For what it's worth, Miri doesn't find any problems with it.
impl<'a> Iterator for StatusRegisterIterMut<'a> {
    type Item = &'a mut bool;

    // Invariant to keep: index is 0, 1, 2 or 3.
    // Every call, this increments by one, capped at 3.
    // index should never be 0 on two different calls,
    // and similarly for 1 and 2.
    fn next(&mut self) -> Option<Self::Item> {
        let result = unsafe {
            match self.index {
                // Safety: Since each of these three branches is
                // executed exactly once, we hand out no more than one
                // mutable reference to each part of self.status.
                // Since self.status is valid for 'a,
                // each partial borrow is also valid for 'a.
                0 => &mut *(&mut self.status.CarryFlag as *mut _),
                1 => &mut *(&mut self.status.ZeroFlag as *mut _),
                2 => &mut *(&mut self.status.OverflowFlag as *mut _),
                _ => return None,
            }
        };
        // If self.index isn't 0, 1 or 2, we'll have already returned,
        // so this bumps us up to 1, 2 or 3.
        self.index += 1;
        Some(result)
    }
}
(playground)
Upsides:
No allocations.
Simple iterator type name.
No external dependencies.
Downsides:
Complicated to implement. To successfully use unsafe, you need to be very familiar with what is and isn't allowed. This part of the answer took me the longest by far to make sure I wasn't doing something wrong.
Unsafety infects the module. Within the module defining this iterator, I could "safely" cause unsoundness by messing with the status or index fields of StatusRegisterIterMut. The only thing allowing encapsulation is that outside of this module, those fields aren't visible.

How to do Type-Length-Value (TLV) serialization with Serde?

I need to serialize a class of structs according to the TLV format with Serde. TLV can be nested in a tree format.
The fields of these structs are serialized normally, much like bincode does, but before the field data I must include a tag (to be associated, ideally) and the length, in bytes, of the field data.
Ideally, Serde would recognize the structs that need this kind of serialization, probably by having them implement a TLV trait. This part is optional, as I can also explicitly annotate each of these structs.
So this question breaks down in 3 parts, in order of priority:
How do I get the length data (from Serde?) before the serialization of that data has been performed?
How do I associate tags with structs (though I guess I could also include tags inside the structs..)?
How do I make Serde recognize a class of structs and apply custom serialization?
Note that 1) is the (core) question here. I will post 2) and 3) as individual questions if 1) can be solved with Serde.
Brace yourself, long post. Also, for convention: I'm picking both type and length to be unsigned 4 byte big endian. Let's start with the easy stuff:
How do I make Serde recognize a class of structs and apply custom serialization?
That's really a separate question, but you can either do that via the #[serde(serialize_with = …)] attributes, or in your serializer's fn serialize_struct(self, name: &'static str, _: usize) based on the name, depending on what exactly you have in mind.
How do I associate tags with structs (though I guess I could also include tags inside the structs..)?
This is a known limitation of serde, and the reason protobuf implementations typically aren't based on serde (take e.g. prost), but have their own derive proc macros that allow annotating structs and fields with the respective tags. You should probably do the same, as it's clean and fast. But since you asked about serde, I'll pick an alternative inspired by serde_protobuf: if you look at it from a weird angle, serde is just a visitor-based reflection framework. It will provide you with structure information about the type you're currently (de-)serializing, e.g. it'll tell you the type, name, and fields of the type you're visiting. All you need is a (user-supplied) function that maps from this type information to the tags. For example:
struct TLVSerializer<'a> {
    ttf: &'a dyn Fn(TypeTagFor) -> u32,
    …
}

impl<'a> Serializer for TLVSerializer<'a> {
    fn serialize_bool(self, v: bool) -> Result<Self::Ok, Self::Error> {
        let tag = &(self.ttf)(TypeTagFor::Bool).to_be_bytes();
        let len = &1u32.to_be_bytes();
        todo!("write");
    }

    fn serialize_i32(self, v: i32) -> Result<Self::Ok, Self::Error> {
        let tag = &(self.ttf)(TypeTagFor::Int {
            signed: true,
            width: 4,
        })
        .to_be_bytes();
        let len = &4u32.to_be_bytes();
        todo!("write");
    }
}
Then, you need to write a function that supplies the tags, e.g. something like:
enum TypeTagFor {
    Bool,
    Int { width: u8, signed: bool },
    Struct { name: &'static str },
    // ...
}

fn foobar_type_tag_for(ttf: TypeTagFor) -> u32 {
    match ttf {
        TypeTagFor::Int {
            width: 4,
            signed: true,
        } => 0x69333200,
        TypeTagFor::Bool => 0x626f6f6c,
        _ => unreachable!(),
    }
}
If you only have one set of type → tag mappings, you could also put it into the serializer directly.
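As a sanity check on the mapping above: the example tags are just ASCII mnemonics ("i32\0" and "bool" read as big-endian u32s). A self-contained version of the mapping function, with the tag values taken from the answer (they're arbitrary examples, not part of any real protocol):

```rust
enum TypeTagFor {
    Bool,
    Int { width: u8, signed: bool },
}

// Same mapping as in the answer; the tags are ASCII mnemonics.
fn foobar_type_tag_for(ttf: TypeTagFor) -> u32 {
    match ttf {
        TypeTagFor::Int {
            width: 4,
            signed: true,
        } => 0x69333200, // "i32\0"
        TypeTagFor::Bool => 0x626f6f6c, // "bool"
        _ => unreachable!(),
    }
}

fn main() {
    // 0x626f6f6c is exactly the bytes of "bool" in big-endian order.
    assert_eq!(foobar_type_tag_for(TypeTagFor::Bool), u32::from_be_bytes(*b"bool"));
    assert_eq!(
        foobar_type_tag_for(TypeTagFor::Int { width: 4, signed: true }),
        0x69333200
    );
}
```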
How do I get the length data (from Serde?) before the serialization of that data has been performed?
The short answer is: Can't. The length can't be known without inspecting the entire structure (there could be Vecs in it, e.g.). But that also tells you what you need to do: You need to inspect the entire structure first, deduce the length, and then do the serialization. And you have precisely one method for inspecting the entire structure at hand: serde. So, you'll write a serializer that doesn't actually serialize anything and only records the length:
struct TLVLenVisitor;

impl Serializer for TLVLenVisitor {
    type Ok = usize;
    type SerializeSeq = TLVLenSumVisitor;

    fn serialize_i32(self, _v: i32) -> Result<Self::Ok, Self::Error> {
        Ok(4)
    }

    fn serialize_str(self, str: &str) -> Result<Self::Ok, Self::Error> {
        Ok(str.len())
    }

    fn serialize_seq(self, _len: Option<usize>) -> Result<Self::SerializeSeq, Self::Error> {
        Ok(TLVLenSumVisitor { sum: 0 })
    }
}

struct TLVLenSumVisitor {
    sum: usize,
}

impl serde::ser::SerializeSeq for TLVLenSumVisitor {
    type Ok = usize;

    fn serialize_element<T: Serialize + ?Sized>(&mut self, value: &T) -> Result<(), Self::Error> {
        // The length of a sequence is the length of all its parts,
        // plus the bytes for the type tag and length
        self.sum += value.serialize(TLVLenVisitor)? + HEADER_LEN;
        Ok(())
    }

    fn end(self) -> Result<Self::Ok, Self::Error> {
        Ok(self.sum)
    }
}
Fortunately, serialization is non-destructive, so you can use this first serializer to get the length, and then do the actual serialization in a second pass:
let len = foobar.serialize(TLVLenVisitor).unwrap();
foobar
    .serialize(TLVSerializer {
        target: &mut File::create("foobar").unwrap(), // No seeking performed on the file
        len,
        ttf: &foobar_type_tag_for,
    })
    .unwrap();
Since you already know the length of what you're serializing, the second serializer is relatively straightforward:
struct TLVSerializer<'a> {
    target: &'a mut dyn Write, // Using dyn to reduce verbosity of the example
    len: usize,
    ttf: &'a dyn Fn(TypeTagFor) -> u32,
}

impl<'a> Serializer for TLVSerializer<'a> {
    type Ok = ();
    type SerializeSeq = TLVSeqSerializer<'a>;

    // Glossing over error handling here.
    fn serialize_seq(self, _len: Option<usize>) -> Result<Self::SerializeSeq, Self::Error> {
        self.target
            .write_all(&(self.ttf)(TypeTagFor::Seq).to_be_bytes())
            .unwrap();
        // Normally, there'd be no way to find the length here.
        // But since TLVSerializer has been told, there's no problem.
        self.target
            .write_all(&u32::try_from(self.len).unwrap().to_be_bytes())
            .unwrap();
        Ok(TLVSeqSerializer {
            target: self.target,
            ttf: self.ttf,
        })
    }
}
The only snag you may hit is that the TLVLenVisitor only gave you one length. But you have many TLV-structures, recursively nested. When you want to write out one of the nested structures (e.g. a Vec), you just run the TLVLenVisitor again, for each element.
struct TLVSeqSerializer<'a> {
    target: &'a mut dyn Write,
    ttf: &'a dyn Fn(TypeTagFor) -> u32,
}

impl<'a> serde::ser::SerializeSeq for TLVSeqSerializer<'a> {
    type Ok = ();

    fn serialize_element<T: Serialize + ?Sized>(&mut self, value: &T) -> Result<(), Self::Error> {
        value.serialize(TLVSerializer {
            // Getting the length of a subfield here
            len: value.serialize(TLVLenVisitor)?,
            target: self.target,
            ttf: self.ttf,
        })
    }

    fn end(self) -> Result<Self::Ok, Self::Error> {
        Ok(())
    }
}
Playground
This also means that you may have to do many passes over the structure you're serializing. This might be fine if speed is not of the essence and you're memory-constrained, but in general, I don't think it's a good idea. You may be tempted to try to get all the lengths in the entire structure in a single pass, which can be done, but it'll either be brittle (since you'd have to rely on visiting order) or difficult (because you'd have to build a shadow structure which contains all the lengths).
Also, do note that this approach expects that two serializer invocations of the same struct traverse the same structure. But an implementer of Serialize is perfectly capable to generating random data on the fly or mutating itself via internal mutability. Which would make this serializer generate invalid data. You can ignore that problem since it's far-fetched, or add a check to the end call and make sure the written length matches the actual written data.
Really, I think it'd be best if you don't worry about finding the length before serialization and wrote the serialization result to memory first. To do so, you can first write all length fields as a dummy value to a Vec<u8>:
struct TLVSerializer<'a> {
    target: &'a mut Vec<u8>,
    ttf: &'a dyn Fn(TypeTagFor) -> u32,
}

impl<'a> Serializer for TLVSerializer<'a> {
    type Ok = ();
    type SerializeSeq = TLVSeqSerializer<'a>;

    fn serialize_seq(self, _len: Option<usize>) -> Result<Self::SerializeSeq, Self::Error> {
        let idx = self.target.len();
        self.target
            .extend((self.ttf)(TypeTagFor::Seq).to_be_bytes());
        // Writing a dummy length here
        self.target.extend(u32::MAX.to_be_bytes());
        Ok(TLVSeqSerializer {
            target: self.target,
            idx,
            ttf: self.ttf,
        })
    }
}
Then after you serialize the content and know its length, you can overwrite the dummies:
struct TLVSeqSerializer<'a> {
    target: &'a mut Vec<u8>,
    idx: usize, // This is how it knows where it needs to write the length
    ttf: &'a dyn Fn(TypeTagFor) -> u32,
}

impl<'a> serde::ser::SerializeSeq for TLVSeqSerializer<'a> {
    type Ok = ();

    fn serialize_element<T: Serialize + ?Sized>(&mut self, value: &T) -> Result<(), Self::Error> {
        value.serialize(TLVSerializer {
            target: self.target,
            ttf: self.ttf,
        })
    }

    fn end(self) -> Result<Self::Ok, Self::Error> {
        end(self.target, self.idx)
    }
}

fn end(target: &mut Vec<u8>, idx: usize) -> Result<(), std::fmt::Error> {
    let len = u32::try_from(target.len() - idx - HEADER_LEN)
        .unwrap()
        .to_be_bytes();
    target[idx + 4..][..4].copy_from_slice(&len);
    Ok(())
}
Playground. And there you go, single pass TLV serialization with serde.
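The back-patching idea boils down to a few lines even without serde. A standalone sketch (`write_tlv` is my name, and the 4-byte tag + 4-byte big-endian length header follows the answer's convention, not a standard):

```rust
// Write tag + dummy length, write the payload, then overwrite
// the dummy once the payload length is known.
const HEADER_LEN: usize = 8; // 4-byte tag + 4-byte length

fn write_tlv(out: &mut Vec<u8>, tag: u32, payload: &[u8]) {
    let idx = out.len();
    out.extend(tag.to_be_bytes());
    out.extend(u32::MAX.to_be_bytes()); // dummy length
    out.extend_from_slice(payload);
    let len = u32::try_from(out.len() - idx - HEADER_LEN).unwrap();
    out[idx + 4..idx + 8].copy_from_slice(&len.to_be_bytes());
}

fn main() {
    let mut buf = Vec::new();
    write_tlv(&mut buf, 0x626f6f6c, &[1]); // "bool" tag, one payload byte
    assert_eq!(buf, [0x62, 0x6f, 0x6f, 0x6c, 0, 0, 0, 1, 1]);
}
```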

How can I implement Ord when the comparison depends on data not part of the compared items?

I have a small struct containing only an i32:
struct MyStruct {
    value: i32,
}
I want to implement Ord in order to store MyStruct in a BTreeMap or any other data structure that requires you to have Ord on its elements.
In my case, comparing two instances of MyStruct does not depend on the values in them, but asking another data structure (a dictionary), and that data structure is unique for each instance of the BTreeMap I will create. So ideally it would look like this:
impl Ord for MyStruct {
    fn cmp(&self, other: &Self, dict: &Dictionary) -> Ordering {
        dict.lookup(self.value).cmp(&dict.lookup(other.value))
    }
}
However this won't be possible, since an Ord implementation only can access two instances of MyStruct, nothing more.
One solution would be storing a pointer to the dictionary in MyStruct but that's overkill. MyStruct is supposed to be a simple wrapper and the pointer would double its size. Another solution is to use a static global, but that's not a good solution either.
In C++ the solution would be easy: Most STL algorithms/data structures let you pass a comparator, where it can be a function object with some state. So I believe Rust would have an idiom to match this somehow, is there any way to accomplish this?
Rust (more specifically Rust's libcollections) currently has no comparator-like construct, so using a mutable static is probably your best bet. This is also used within rustc, e.g. the string interner is static. With that said, the use case isn't exactly uncommon, so maybe if we petition for it, Rust will get external comparators one day.
I remember the debate over whether allowing a custom comparator was worth it or not, and it was decided that it complicated the API a lot, when most of the time one can achieve the same effect by using a new (wrapping) type and redefining PartialOrd for it.
It was, ultimately, a trade-off: weighing API simplicity versus unusual needs (which are probably summed up as access to external resources).
In your specific case, there are two solutions:
use the API the way it was intended: create a wrapper structure containing both an instance of MyStruct and a reference to the dictionary, then define Ord on that wrapper and use this as key in the BTreeMap
circumvent the API... somehow
I would personally advise starting with using the API as intended, and measure, before going down the road of trying to circumvent it.
@ker was kind enough to provide the following illustration of the wrapping approach in the comments (playground version):
#[derive(Eq, PartialEq, Debug)]
struct MyStruct {
    value: i32,
}

#[derive(Debug)]
struct MyStructAsKey<'a> {
    inner: MyStruct,
    dict: &'a Dictionary,
}

impl<'a> Eq for MyStructAsKey<'a> {}

impl<'a> PartialEq for MyStructAsKey<'a> {
    fn eq(&self, other: &Self) -> bool {
        self.inner == other.inner
            && self.dict as *const _ as usize == other.dict as *const _ as usize
    }
}

impl<'a> Ord for MyStructAsKey<'a> {
    fn cmp(&self, other: &Self) -> ::std::cmp::Ordering {
        self.dict.lookup(&self.inner).cmp(&other.dict.lookup(&other.inner))
    }
}

impl<'a> PartialOrd for MyStructAsKey<'a> {
    fn partial_cmp(&self, other: &Self) -> Option<::std::cmp::Ordering> {
        Some(self.dict.lookup(&self.inner).cmp(&other.dict.lookup(&other.inner)))
    }
}

#[derive(Default, Debug)]
struct Dictionary(::std::cell::RefCell<::std::collections::HashMap<i32, u64>>);

impl Dictionary {
    fn ord_key<'a>(&'a self, ms: MyStruct) -> MyStructAsKey<'a> {
        MyStructAsKey {
            inner: ms,
            dict: self,
        }
    }

    fn lookup(&self, key: &MyStruct) -> u64 {
        self.0.borrow()[&key.value]
    }

    fn create(&self, value: u64) -> MyStruct {
        let mut map = self.0.borrow_mut();
        let n = map.len();
        assert!(n as i32 as usize == n);
        let n = n as i32;
        map.insert(n, value);
        MyStruct { value: n }
    }
}

fn main() {
    let dict = Dictionary::default();
    let a = dict.create(99);
    let b = dict.create(42);
    let mut set = ::std::collections::BTreeSet::new();
    set.insert(dict.ord_key(a));
    set.insert(dict.ord_key(b));
    println!("{:#?}", set);
    let c = dict.create(1000);
    let d = dict.create(0);
    set.insert(dict.ord_key(c));
    set.insert(dict.ord_key(d));
    println!("{:#?}", set);
}

Same object with different API faces at compile time?

I have an object that can be in either of two modes: a source or a sink. It is always in one of them and it is always known at compile time (when passed the object you know if you are going to read or write to it obviously).
I can put all the methods on the same object and just assume I won't be called improperly (or return an error when I am), or, I was thinking, I could make two tuple structs of the single underlying object and attach the methods to those tuple structs instead. The methods are almost entirely disjoint.
It is kind of abusing the fact that both tuple structs have the same layout and there is zero overhead for the casts and tuple storage.
Think of this similar to the Java ByteBuffer and related classes where you write then flip then read then flip back and write more. Except this would catch errors in usage.
However, it does seem a little unusual and might be overly confusing for such a small problem. It also seems like there should be a better way to do this; the only requirement is zero overhead, so no dynamic dispatch.
https://play.rust-lang.org/?gist=280d2ec2548e4f38e305&version=stable
#[derive(Debug)]
struct Underlying {
    a: u32,
    b: u32,
}

#[derive(Debug)]
struct FaceA(Underlying);

impl FaceA {
    fn make() -> FaceA { FaceA(Underlying { a: 1, b: 2 }) }
    fn doa(&self) { println!("FaceA do A {:?}", *self); }
    fn dou(&self) { println!("FaceA do U {:?}", *self); }
    fn tob(&self) -> &FaceB { unsafe { std::mem::transmute::<&FaceA, &FaceB>(self) } }
}

#[derive(Debug)]
struct FaceB(Underlying);

impl FaceB {
    fn dob(&self) { println!("FaceB do B {:?}", *self); }
    fn dou(&self) { println!("FaceB do U {:?}", *self); }
    fn toa(&self) -> &FaceA { unsafe { std::mem::transmute::<&FaceB, &FaceA>(self) } }
}

fn main() {
    let a = FaceA::make();
    a.doa();
    a.dou();
    let b = a.tob();
    b.dob();
    b.dou();
    let aa = b.toa();
    aa.doa();
    aa.dou();
}
First of all, it seems like you don't understand how ownership works in Rust; you may want to read the Ownership chapter of the Rust Book. Specifically, the way you keep re-aliasing the original FaceA is how you would specifically enable the very thing you say you want to avoid. Also, all the borrows are immutable, so it's not clear how you intend to do any sort of mutation.
As such, I've written a new example from scratch that involves going between two types with disjoint interfaces (view on playpen).
#[derive(Debug)]
pub struct Inner {
    pub value: i32,
}

impl Inner {
    pub fn new(value: i32) -> Self {
        Inner { value: value }
    }
}

#[derive(Debug)]
pub struct Upper(Inner);

impl Upper {
    pub fn new(inner: Inner) -> Self {
        Upper(inner)
    }

    pub fn into_downer(self) -> Downer {
        Downer::new(self.0)
    }

    pub fn up(&mut self) {
        self.0.value += 1;
    }
}

#[derive(Debug)]
pub struct Downer(Inner);

impl Downer {
    pub fn new(inner: Inner) -> Self {
        Downer(inner)
    }

    pub fn into_upper(self) -> Upper {
        Upper::new(self.0)
    }

    pub fn down(&mut self) {
        self.0.value -= 1;
    }
}

fn main() {
    let mut a = Upper::new(Inner::new(0));
    a.up();
    let mut b = a.into_downer();
    b.down();
    b.down();
    b.down();
    let mut c = b.into_upper();
    c.up();
    show_i32(c.0.value);
}

#[inline(never)]
fn show_i32(v: i32) {
    println!("v: {:?}", v);
}
Here, the into_upper and into_downer methods consume the subject value, preventing anyone from using it afterwards (try accessing a after the call to a.into_downer()).
This should not be particularly inefficient; there is no heap allocation going on here, and Rust is pretty good at moving values around efficiently. If you're curious, this is what the main function compiles down to with optimisations enabled:
mov edi, -1
jmp _ZN8show_i3220h2a10d619fa41d919UdaE
It literally inlines the entire program (save for the show function that I specifically told it not to inline). Unless profiling shows this to be a serious performance problem, I wouldn't worry about it.
