Different Target for Deref and DerefMut - rust

I am trying to create a wrapper type around a cxx::UniquePtr<A>.
I want to implement struct B(UniquePtr<A>) with
Deref<Target=UniquePtr<A>> for B
but also
DerefMut<Target=Pin<&mut A>> for B {}
Is this a terrible idea or even possible? The whole point of B is to hide the type complexity of UniquePtr<A>. Right now I am forcing my users to call .pin_mut() every time they want to do something and it is quite ugly and breaks their code auto completion. Does anyone have any suggestions or resources I should look at to clean up the interface?

Related

Is there a way to automatically register trait implementors?

I'm trying to load JSON files that refer to structs implementing a trait. When the JSON files are loaded, the struct is grabbed from a hashmap. The problem is, I'll probably have to have a lot of structs put into that hashmap all over my code. I would like to have that done automatically. To me this seems to be doable with procedural macros, something like:
#[my_proc_macro(type=ImplementedType)]
struct MyStruct {}
impl ImplementedType for MyStruct {}
fn load_implementors() {
    let implementors = HashMap::new();
    load_implementors!(implementors, ImplementedType);
}
Is there a way to do this?
No
There is a core issue that makes it difficult to skip manually inserting into a structure. Consider this simplified example, where we simply want to print values that are provided separately in the code-base:
my_register!(alice);
my_register!(bob);
fn main() {
    my_print(); // prints "alice" and "bob"
}
In typical Rust, there is no mechanism to link the my_print() call to the multiple invocations of my_register. There is no support for declaration merging, run-time/compile-time reflection, or run-before-main execution that you might find in other languages that might make this possible (unless of course there's something I'm missing).
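For comparison, the manual approach that the question hopes to avoid might look like the sketch below. It uses only the standard library; the trait body, the `Alice` and `Bob` types, the `make_*` functions and the `Constructor` alias are all hypothetical names chosen for illustration, not part of the question's code.

```rust
use std::collections::HashMap;

// Hypothetical trait standing in for the question's `ImplementedType`.
trait ImplementedType {
    fn name(&self) -> String;
}

// Two hypothetical implementors.
struct Alice;
struct Bob;

impl ImplementedType for Alice {
    fn name(&self) -> String {
        "alice".to_string()
    }
}

impl ImplementedType for Bob {
    fn name(&self) -> String {
        "bob".to_string()
    }
}

// A constructor produces a boxed trait object.
type Constructor = fn() -> Box<dyn ImplementedType>;

fn make_alice() -> Box<dyn ImplementedType> {
    Box::new(Alice)
}

fn make_bob() -> Box<dyn ImplementedType> {
    Box::new(Bob)
}

// The single central place where every implementor is listed by hand.
// Forgetting a line here is exactly the maintenance burden the question describes.
fn load_implementors() -> HashMap<&'static str, Constructor> {
    let mut implementors: HashMap<&'static str, Constructor> = HashMap::new();
    implementors.insert("alice", make_alice);
    implementors.insert("bob", make_bob);
    implementors
}
```

A loader can then look up a constructor by the key found in the JSON file and call it to obtain a value.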
But Also Yes
There are third party crates built around link-time or run-time tricks that can make this possible:
ctor allows you to define functions that are executed before main(). With it, you can have my_register!() create individual functions for alice and bob that, when executed, add themselves to some global structure which can then be accessed by my_print().
linkme allows you to define a slice whose elements are declared separately and gathered into one slice at link time. The my_register!() macro simply needs to use this crate's attributes to add an element to the slice, which my_print() can easily access.
I understand skepticism of these methods since the declarative approach is often clearer to me, but sometimes they are necessary or the ergonomic benefits outweigh the "magic".

Rust E0382 - value used here after move

I am new to Rust and am really struggling to write code the Rust way. I understand its rules for enforcing memory correctness, but I cannot figure out the changes my code needs in order to comply.
I have created a tree-like object from the JSON structure received from the application.
I am trying to implement two operations on the tree:
Get the leaves of the tree
Get the mapping of parent -> children in a map
The high-level code looks like this:
fn rename_workspaces(conn: Connection) {
    let i3_info = I3Info::new(conn);
    let _leaves = i3_info.get_leaves();
    let _parent_child = i3_info.dfs_parent_child();
}
However, Rust complains that the i3_info variable is used after being moved. I understand the complaint, but I cannot figure out the correct Rust way to solve it.
Please help me figure out the change in thinking required here. This is important because my application really needs to perform these calculations on the tree structure multiple times.
The interesting thing is that I am not really mutating the structure, just iterating over it and returning a new structure from the function.
Source link: https://github.com/madhur/i3-auto-workspace-icons-rust/blob/main/src/main.rs
The problem is that you have declared the methods of I3Info such that they consume (move) the I3Info:
pub fn dfs_parent_child(self) ...
pub fn get_leaves(self) ...
To not consume the I3Info, allowing it to be used more than once, declare your methods to take references to the I3Info:
pub fn dfs_parent_child(&self) ...
pub fn get_leaves(&self) ...
You will need to modify the code within these methods, also, to work with references because this change also means you can no longer move things out of self — they have to be left intact. Sometimes this is as simple as putting & before a field access (&self.foo instead of self.foo), and sometimes it will require more extensive changes.
The general “Rust way of thinking” lessons here are:
Think about the type of your method receivers. self is not always right, and neither is &self.
Don't take ownership of values except when it makes sense. Passing by & reference is a good default choice (except for Copy types, like numbers).
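The receiver change can be sketched as follows. `Tree`, its `edges` field and the `parent_child` method are hypothetical stand-ins for the question's `I3Info` (the real type wraps an i3 connection and tree); the point is only that both methods take `&self`, so the same value can be queried repeatedly.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical stand-in for `I3Info`: a tree stored as (parent, child) pairs.
struct Tree {
    edges: Vec<(u32, u32)>,
}

impl Tree {
    // Borrows the tree; nothing is moved out of `self`.
    // A leaf is a node that appears as a child but never as a parent.
    fn get_leaves(&self) -> Vec<u32> {
        let parents: BTreeSet<u32> = self.edges.iter().map(|&(p, _)| p).collect();
        self.edges
            .iter()
            .map(|&(_, c)| c)
            .filter(|c| !parents.contains(c))
            .collect()
    }

    // Also borrows; builds a new map instead of consuming the tree.
    fn parent_child(&self) -> BTreeMap<u32, Vec<u32>> {
        let mut map = BTreeMap::new();
        for &(p, c) in &self.edges {
            map.entry(p).or_insert_with(Vec::new).push(c);
        }
        map
    }
}
```

With `self` receivers instead, the second call on the same `Tree` would fail with E0382, because the first call would already have moved the value.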

Why can I access struct fields by a variable and the reference to that variable in the same way? (Rust)

If I print x.passwd, I get 234.
If I print y.passwd, I get 234 too. But how is that possible, since y = &x (essentially storing the address of x)? Shouldn't I have to dereference it to access passwd, like (*y).passwd?
I was solving a LeetCode problem where a node's val field was accessed directly through the reference, without dereferencing, and that made me more confused about references.
On the left-hand side we have Option<Box>, while on the right we have &Option<Box>. How can we perform Some(node) = node?
P.S. I hope someone can explain with a memory diagram what is actually happening. And if anyone has good resources for understanding references and borrowing, please let me know. I have been referring to the docs and the Let's Get Rusty YouTube channel, but references are still a little confusing for me.
In Rust, the . operator will automatically dereference as necessary to find something with the right name. In y.passwd, y is a reference, but references don't have any named fields, so the compiler tries looking at the type of the referent — Cred — and does find the field named passwd.
The same thing works with methods, but there's some more to it — in addition to dereferencing, the compiler will also try adding an & to find a matching method. That way you don't have to write the awkward (&foo).bar() just to call a method that takes &self, similar to how you've already found that you don't have to write (*y).passwd.
In general, you rarely (but not never) have to worry about whether or not something is a reference when using the . operator.
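A small sketch of both behaviours, using the `Cred` and `passwd` names from the question. The `passwd_len` method and the two helper functions are hypothetical additions for illustration.

```rust
struct Cred {
    passwd: u32,
}

impl Cred {
    // Hypothetical method taking `&self`, used to show auto-referencing.
    fn passwd_len(&self) -> usize {
        self.passwd.to_string().len()
    }
}

// Field access: the same field is read through the value, through a
// reference, and through an explicit dereference.
fn read_fields() -> (u32, u32, u32) {
    let x = Cred { passwd: 234 };
    let y = &x;
    (x.passwd, y.passwd, (*y).passwd)
}

// Method calls: the compiler adds `&` for `x` and removes one or more
// layers of reference for `y` and `z` to find `passwd_len`.
fn call_methods() -> (usize, usize, usize) {
    let x = Cred { passwd: 234 };
    let y = &x; // &Cred
    let z = &y; // &&Cred
    (x.passwd_len(), y.passwd_len(), z.passwd_len())
}
```

All three expressions in each function refer to the same `Cred`, so they yield the same value.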

Can anyone show me a solution to Iterate all IPv4 addresses found in a file?

When I am experimenting with a new language that I'm unfamiliar with, my hello world is listing all IPv4 found in a mixed file (for example a log file). I think it is a good exercise because it gets me to practice with IO, packages, functions, regexes and iterators.
I tried for 2-3 hours to accomplish that in Rust, I still haven't found any elegant way to do it. I'm obviously doing it wrong.
Can anyone show me their solution to achieve this? It will help my brain to unlock by seeing the most efficient/elegant way. Or do you recommend me to continue to bash on the pile until I have it right?
The goal: pass a file name to a function that returns an iterator over all IPv4 addresses in that file.
I saw that Rust supports iterators as well as generators/yield. I would like to see solutions for both if possible.
For simplicity I skip error handling (using unwrap and expect), since it would hurt readability. For this kind of task you don't need external crates (e.g. regex), because the standard library already implements the parsing through FromStr. For per-line reading, the BufRead trait with a BufReader wrapper does the job. Note that this parses each whole line as an address, so it expects exactly one IPv4 address per line. Composed, it becomes (playground):
use std::fs::File;
use std::io::{self, BufRead};
use std::net::Ipv4Addr;
use std::path::Path;
fn iterate_over_ips(filename: impl AsRef<Path>) -> impl Iterator<Item = Ipv4Addr> {
    let file = File::open(filename).unwrap();
    io::BufReader::new(file)
        .lines()
        .map(|line| line.expect("line read").parse().expect("ip invalid format"))
}
Generators are an unstable feature (so their API may change at any time) and for now are mostly used internally by the compiler for asynchronous code. Iterators are a much better fit for this particular task.
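For a mixed file such as a log, where addresses are embedded in other text, one possible variation is sketched below. It is not the code from the answer above: the function name `ipv4_in` and the tokenising rule (split on anything that is not a digit or a dot, then trim stray dots) are assumptions made for illustration, and it takes any `BufRead` rather than a file name so that it can also be fed in-memory data.

```rust
use std::io::BufRead;
use std::net::Ipv4Addr;

// Hypothetical helper: yields every token in the input that parses as an
// IPv4 address. Reading stops silently at the first I/O error.
fn ipv4_in<R: BufRead>(reader: R) -> impl Iterator<Item = Ipv4Addr> {
    reader
        .lines()
        .map_while(Result::ok)
        .flat_map(|line| {
            // Split on every character that cannot be part of a dotted quad.
            line.split(|c: char| !(c.is_ascii_digit() || c == '.'))
                // Drop sentence punctuation such as a trailing full stop.
                .map(|token| token.trim_matches('.'))
                // Keep only the tokens that `FromStr` accepts.
                .filter_map(|token| token.parse::<Ipv4Addr>().ok())
                // Collect so the result no longer borrows from `line`.
                .collect::<Vec<_>>()
        })
}
```

To read from a file, the caller could wrap it first, for example `ipv4_in(BufReader::new(File::open(path)?))`. The validation is delegated to `Ipv4Addr`'s `FromStr` implementation, so tokens such as `22` or `1.2.3` are skipped.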

Is it safe and defined behavior to transmute between a T and an UnsafeCell<T>?

A recent question was looking for the ability to construct self-referential structures. In discussing possible answers for the question, one potential answer involved using an UnsafeCell for interior mutability and then "discarding" the mutability through a transmute.
Here's a small example of such an idea in action. I'm not deeply interested in the example itself, but it's just enough complication to require a bigger hammer like transmute as opposed to just using UnsafeCell::new and/or UnsafeCell::into_inner:
use std::{
    cell::UnsafeCell, mem, rc::{Rc, Weak},
};
// This is our real type.
struct ReallyImmutable {
    value: i32,
    myself: Weak<ReallyImmutable>,
}
fn initialize() -> Rc<ReallyImmutable> {
    // This mirrors ReallyImmutable but we use `UnsafeCell`
    // to perform some initial interior mutation.
    struct NotReallyImmutable {
        value: i32,
        myself: Weak<UnsafeCell<NotReallyImmutable>>,
    }
    let initial = NotReallyImmutable {
        value: 42,
        myself: Weak::new(),
    };
    // Without interior mutability, we couldn't update the `myself` field
    // after we've created the `Rc`.
    let second = Rc::new(UnsafeCell::new(initial));
    // Tie the recursive knot
    let new_myself = Rc::downgrade(&second);
    unsafe {
        // Should be safe as there can be no other accesses to this field
        (&mut *second.get()).myself = new_myself;
        // No one outside of this function needs the interior mutability
        // TODO: Is this call safe?
        mem::transmute(second)
    }
}
fn main() {
    let v = initialize();
    println!("{} -> {:?}", v.value, v.myself.upgrade().map(|v| v.value))
}
This code appears to print out what I'd expect, but that doesn't mean that it's safe or using defined semantics.
Is transmuting from a UnsafeCell<T> to a T memory safe? Does it invoke undefined behavior? What about transmuting in the opposite direction, from a T to an UnsafeCell<T>?
(I am still new to SO and not sure if "well, maybe" qualifies as an answer, but here you go. ;)
Disclaimer: The rules for these kinds of things are not (yet) set in stone. So, there is no definitive answer yet. I'm going to make some guesses based on (a) what kinds of compiler transformations LLVM does/we will eventually want to do, and (b) what kind of models I have in my head that would define the answer to this.
Also, I see two parts to this: The data layout perspective, and the aliasing perspective. The layout issue is that NotReallyImmutable could, in principle, have a totally different layout than ReallyImmutable. I don't know much about data layout, but with UnsafeCell becoming repr(transparent) and that being the only difference between the two types, I think the intent is for this to work. You are, however, relying on repr(transparent) being "structural" in the sense that it should allow you to replace things in larger types, which I am not sure has been written down explicitly anywhere. Sounds like a proposal for a follow-up RFC that extends the repr(transparent) guarantees appropriately?
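The narrow part of the layout claim, that `UnsafeCell<T>` itself has the same size and alignment as `T`, can be checked with a small sketch. The helper name `same_layout` is hypothetical. This says nothing about whether two separately declared structs such as `ReallyImmutable` and `NotReallyImmutable` are laid out identically, which is the open question raised above.

```rust
use std::cell::UnsafeCell;
use std::mem::{align_of, size_of};

// Hypothetical helper: compares the size and alignment of `T` with those
// of `UnsafeCell<T>`. It does not inspect field order or nested types.
fn same_layout<T>() -> bool {
    size_of::<T>() == size_of::<UnsafeCell<T>>()
        && align_of::<T>() == align_of::<UnsafeCell<T>>()
}
```

A check like this is necessary for the transmute to be sound but far from sufficient.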
As far as aliasing is concerned, the issue is breaking the rules around &T. I'd say that, as long as you never have a live &T around anywhere when writing through the &UnsafeCell<T>, you are good -- but I don't think we can guarantee that quite yet. Let's look in more detail.
Compiler perspective
The relevant optimizations here are the ones that exploit &T being read-only. So if you reordered the last two lines (transmute and the assignment), that code would likely be UB as we may want the compiler to be able to "pre-fetch" the value behind the shared reference and re-use that value later (i.e. after inlining this).
But in your code, we would only emit "read-only" annotations (noalias in LLVM) after the transmute comes back, and the data is indeed read-only starting there. So, this should be good.
Memory models
The "most aggressive" of my memory models essentially asserts that all values are always valid, and I think even that model should be fine with your code. &UnsafeCell is a special case in that model where validity just stops, and nothing is said about what lives behind this reference. The moment the transmute returns, we grab the memory it points to and make it all read-only, and even if we did that "recursively" through the Rc (which my model doesn't, but only because I couldn't figure out a good way to make it do so) you'd be fine as you don't mutate any more after the transmute. (As you may have noticed, this is the same restriction as in the compiler perspective. The point of these models is to allow compiler optimizations, after all. ;)
(As a side-note, I really wish miri was in better shape right now. Seems I have to try and get validation to work again in there, because then I could tell you to just run your code in miri and it'd tell you if that version of my model is okay with what you are doing :D )
I am thinking about other models currently that only check things "on access", but haven't worked out the UnsafeCell story for that model yet. What this example shows is that the model may have to contain ways for a "phase transition" of memory first being UnsafeCell, but later having normal sharing with read-only guarantees. Thanks for bringing this up, that will make for some nice examples to think about!
So, I think I can say that (at least from my side) there is the intent to allow this kind of code, and doing so does not seem to prevent any optimizations. Whether we'll actually manage to find a model that everybody can agree with and that still allows this, I cannot predict.
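For the specific knot-tying example in the question, newer versions of the standard library offer `Rc::new_cyclic` (stable since Rust 1.60), which sidesteps the transmute entirely. This is a different technique from the one discussed in this answer and is shown only as a sketch of a safe alternative; it does not settle the question about transmuting in general.

```rust
use std::rc::{Rc, Weak};

// Same shape as the question's `ReallyImmutable`.
struct ReallyImmutable {
    value: i32,
    myself: Weak<ReallyImmutable>,
}

fn initialize() -> Rc<ReallyImmutable> {
    // The closure receives a `Weak` pointing at the allocation being
    // created, so `myself` can be filled in without interior mutability.
    Rc::new_cyclic(|weak| ReallyImmutable {
        value: 42,
        myself: weak.clone(),
    })
}
```

Inside the closure the `Weak` cannot be upgraded yet; it only becomes usable once `new_cyclic` has returned the finished `Rc`.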
The opposite direction: T -> UnsafeCell<T>
Now, this is more interesting. The problem is that, as I said above, you must not have a &T live when writing through an UnsafeCell<T>. But what does "live" mean here? That's a hard question! In some of my models, this could be as weak as "a reference of that type exists somewhere and the lifetime is still active", i.e., it could have nothing to do with whether the reference is actually used. (That's useful because it lets us do more optimizations, like moving a load out of a loop even if we cannot prove that the loop ever runs -- which would introduce a use of an otherwise unused reference.) And since &T is Copy, you cannot even really get rid of such a reference either. So, if you have x: &T, then after let y: &UnsafeCell<T> = transmute(x), the old x is still around and its lifetime still active, so writing through y could well be UB.
I think you'd have to somehow restrict the aliasing that &T allows, very carefully making sure that nobody still holds such a reference. I'm not going to say "this is impossible" because people keep surprising me (especially in this community ;) but TBH I cannot think of a way to make this work. I'd be curious if you have an example though where you think this is reasonable.

Resources