How do I appease the borrow checker with nested data structures?

How do I appease the borrow checker with nested data structures? - rust

I'm working on a (rather ambitious) text editor, and I'm trying to implement arbitrary vertical and horizontal splits to display text buffers in, something like this:
buffer
-----------------
buffer |
-------| buffer
buffer |
I have this structure represented as a binary tree-type thing:
h
/ \
v b
/ \
b h
/ \
b b
where v is a vertical split, h is a horizontal split, and b is a buffer.
In code form, it is this:
pub enum LayoutNode {
Buf(Buffer),
Split(Box<Split>),
}
pub enum Split {
Vertical(LayoutNode, LayoutNode),
Horizontal(LayoutNode, LayoutNode)
}
pub struct Buffer {
content: String,
// more buffer-related stuff
}
All is well. My vertical split method:
impl LayoutNode {
pub fn vertical_split(layout: LayoutNode) -> LayoutNode {
LayoutNode::Split(Box::new(Split::Vertical(layout, LayoutNode::Buf(Buffer::new()))))
// Buffer::new() returns an empty Buffer
}
}
This function compiles, but is not the whole story. I have a data structure responsible for the editor's layout nodes, called Editor:
impl Editor {
pub fn new() -> Editor {
Editor {
buffers: LayoutNode::Buf(Buffer::empty()),
// more editor-related stuff
}
}
pub fn vertical_split(&mut self) {
// buffers needs to be a part of itself
self.buffers = LayoutNode::vertical_split(self.buffers);
// cannot move out of borrowed content ^
}
}
I've taken a look at mem::replace but I'm not sure that it's what I need for this case with nested data structures. The rustc --explain page for E0507 isn't very helpful in this regard.
How do I work with the borrow checker in this case? I'd rather not clone everything, since that will easily waste a lot of memory with a new copy of every file upon each split.

mem::replace is often used in situations like this to set the field to a dummy, but valid value while you're producing the new value. This is necessary to ensure that if the thread panics while producing the new value, the destructor will not free the same objects twice.
In your case, it might look something like this:
impl Editor {
pub fn new() -> Editor {
Editor {
buffers: LayoutNode::Buf(Buffer::empty()),
}
}
pub fn vertical_split(&mut self) {
// buffers needs to be a part of itself
self.buffers = LayoutNode::vertical_split(
mem::replace(&mut self.buffers, LayoutNode::Buf(Buffer::empty())));
}
}
It works like this: mem::replace receives a mutable reference to the variable or field to replace and the value to assign, and returns the old value. You get ownership of the result, so you can move it freely.

Related

How do I move out of a struct that has a destructor, if I don't want to run the destructor?

Suppose I have a Rust struct that has a destructor:
pub struct D1 {
a: String,
b: String
}
impl Drop for D1 {
fn drop(&mut self) {
println!("{0}", self.a)
}
}
I want to implement a method that moves some or all of the fields out of the object, destroying it in the process. Obviously, that can't be done safely if the object's destructor gets run – it would attempt to unsafely read from the moved fields. However, it should be safe if the destructor is suppressed somehow. But I can't find a way to suppress the destructor in a way that makes the move safe.
mem::forget doesn't work, because the compiler doesn't treat it specially when it comes to destructors:
impl D1 {
fn into_raw_parts(self) -> (String, String) {
let (a, b) = (self.a, self.b); // cannot move here
mem::forget(self);
(a, b)
}
}
and ManuallyDrop doesn't work either – it provides the interior data only via a reference, preventing a move:
use std::mem;
impl D1 {
fn into_raw_parts(self) -> (String, String) {
let md = mem::ManuallyDrop::new(self);
(md.a, md.b) // cannot move out of dereference
}
}
use std::mem;
impl D1 {
fn into_raw_parts(self) -> (String, String) {
let md = mem::ManuallyDrop::new(self);
(md.value.a, md.value.b) // field `value` is private
}
}
I could presumably solve this problem by dipping into unsafe code, but I'm wondering whether there's a safe way to do this – have I missed some API or other technique to tell Rust that I can safely move out of a value because I'm not planning to run the destructor?

There is no way in safe code to leave a type implementing Drop uninitialized. However, you can leave it initialized with dummy values:
use std::mem;
impl D1 {
fn into_raw_parts(self) -> (String, String) {
(mem::take(&mut self.a), mem::take(&mut self.b))
}
}
This will leave self with empty strings (which do not even allocate any memory). Of course, the Drop implementation has to tolerate this and not do something undesired.
If the fields’ types in your real situation do not have convenient placeholder values, then you can make the fields be Options. This is also a common technique when Drop::drop() itself needs to move out of self.

How can I move structs containing Vec to/from a static array without getting E0507?

I need a static array of structs and the structs contain a Vec. I can manage the lifetimes of the actual values. I get the following error:
: Mar23 ; cargo test
Compiling smalltalk v0.1.0 (/Users/dmason/git/AST-Smalltalk/rust)
error[E0507]: cannot move out of `dispatchTable[_]` as `dispatchTable` is a static item
--> src/minimal.rs:32:44
|
30 | let old = ManuallyDrop::into_inner(dispatchTable[pos]);
| ^^^^^^^^^^^^^^^^^^ move occurs because `dispatchTable[_]` has type `ManuallyDrop<Option<Box<Dispatch>>>`, which does not implement the `Copy` trait
error: aborting due to previous error
Here is a minimal compilable example:
#[derive(Copy, Clone)]
struct MethodMatch {
hash: i64,
method: Option<bool>,
}
#[derive(Clone)]
pub struct Dispatch {
class: i64,
table: Vec<MethodMatch>,
}
const max_classes : usize = 100;
use std::mem::ManuallyDrop;
const no_dispatch : ManuallyDrop<Option<Box<Dispatch>>> = ManuallyDrop::new(None);
static mut dispatchTable : [ManuallyDrop<Option<Box<Dispatch>>>;max_classes] = [no_dispatch;max_classes];
use std::sync::RwLock;
lazy_static! {
static ref dispatchFree : RwLock<usize> = {RwLock::new(0)};
}
pub fn addClass(c : i64, n : usize) {
let mut index = dispatchFree.write().unwrap();
let pos = *index;
*index += 1;
replaceDispatch(pos,c,n);
}
pub fn replaceDispatch(pos : usize, c : i64, n : usize) -> Option<Box<Dispatch>> {
let mut table = Vec::with_capacity(n);
table.resize(n,MethodMatch{hash:0,method:None});
unsafe {
let old = ManuallyDrop::into_inner(dispatchTable[pos]);
dispatchTable[pos]=ManuallyDrop::new(Some(Box::new(Dispatch{class:c,table:table})));
old
}
}
The idea I had was to have replaceDispatch create a new Dispatch option object, and replace the current value in the array with the new one, returning the original, with the idea that the caller will get the Dispatch option value and be able to use and then drop/deallocate the object.
I found that it will compile if I add .clone() right after the identified error point. But then the original value never gets dropped, so (the into_inner is redundant and) I'm creating a memory leak!. Do I have to manually drop it (if I could figure out how)? I thought that's what ManuallyDrop bought me. In theory, if I created a copy of the fields from the Vec into a copy, that would point to the old data, so when that object got dropped, the memory would get freed. But (a) that seems very dirty, (b) it's a bit of ugly, unnecessary code (I have to handle the Some/None cases, look inside the Vec, etc.), and (c) I can't see how I'd even do it!!!!

As the compiler tells you, you cannot move a value out of a place observable by others. But since you have the replacement at the ready, you can use std::mem::replace:
pub fn replaceDispatch(pos: usize, c: i64, n: usize) -> Option<Box<Dispatch>> {
... table handling omitted ...
unsafe {
let old = std::mem::replace(
&mut dispatchTable[pos],
ManuallyDrop::new(Some(Box::new(Dispatch {
class: c,
table: table,
}))),
);
ManuallyDrop::into_inner(old)
}
}
Playground
In fact, since you're using the Option to manage the lifetime of Dispatch, you don't need ManuallyDrop at all, and you also don't need the Box: playground.

Killing a process stored in Mutex<Option<Child>> [duplicate]

I want to collect changes to a struct and apply them all at once.
The basic outline looks like this:
enum SomeEnum {
Foo,
Bar,
}
struct SomeStruct {
attrib: SomeEnum,
next_attrib: Option<SomeEnum>,
}
impl SomeStruct {
pub fn apply_changes(&mut self) {
if let Some(se) = self.next_attrib {
self.attrib = se;
}
self.next_attrib = None;
}
}
which yields the following compiler error:
error[E0507]: cannot move out of borrowed content
--> src/lib.rs:13:27
|
13 | if let Some(se) = self.next_attrib {
| -- ^^^^ cannot move out of borrowed content
| |
| hint: to prevent move, use `ref se` or `ref mut se`
I found Get an enum field from a struct: cannot move out of borrowed content and added #[derive(Clone, Copy)] to my enum's definition.
This may work but I feel uncomfortable about (implicitly) using copying since this could generally happen to larger datatypes as well.
The actual owner is never moved out of the struct.
Is there another way to accomplish this, without exposing the Copy/Clone traits to all users of the enum?

Essentially, you can't assign the value to self.attrib if it's still owned by self.next_atrrib. That means you need to remove the value from self.next_attrib and then give ownership to self.attrib.
One way to do this would be to manually replace the value. For instance, you could use std::mem::replace to replace the value with None and take ownership of the current value as next_attrib. Then you can take the value and, if it is Some(_), you can place its content in self.attrib.:
impl SomeStruct {
pub fn apply_changes(&mut self) {
let next_attrib = std::mem::replace(&mut self.next_attrib, None);
if let Some(se) = next_attrib {
self.attrib = se;
}
}
}
Since this is a relatively common pattern, however, there is a utility function on Option to handle situations where you'd like to take ownership of the contents of an Option and set the Option to None. The Option::take method is what you want.
impl SomeStruct {
pub fn apply_changes(&mut self) {
if let Some(se) = self.next_attrib.take() {
self.attrib = se;
}
}
}
See also:
How can I swap in a new value for a field in a mutable reference to a structure?

Can a type know when a mutable borrow to itself has ended?

I have a struct and I want to call one of the struct's methods every time a mutable borrow to it has ended. To do so, I would need to know when the mutable borrow to it has been dropped. How can this be done?

Disclaimer: The answer that follows describes a possible solution, but it's not a very good one, as described by this comment from Sebastien Redl:
[T]his is a bad way of trying to maintain invariants. Mostly because dropping the reference can be suppressed with mem::forget. This is fine for RefCell, where if you don't drop the ref, you will simply eventually panic because you didn't release the dynamic borrow, but it is bad if violating the "fraction is in shortest form" invariant leads to weird results or subtle performance issues down the line, and it is catastrophic if you need to maintain the "thread doesn't outlive variables in the current scope" invariant.
Nevertheless, it's possible to use a temporary struct as a "staging area" that updates the referent when it's dropped, and thus maintain the invariant correctly; however, that version basically amounts to making a proper wrapper type and a kind of weird way to use it. The best way to solve this problem is through an opaque wrapper struct that doesn't expose its internals except through methods that definitely maintain the invariant.
Without further ado, the original answer:
Not exactly... but pretty close. We can use RefCell<T> as a model for how this can be done. It's a bit of an abstract question, but I'll use a concrete example to demonstrate. (This won't be a complete example, but something to show the general principles.)
Let's say you want to make a Fraction struct that is always in simplest form (fully reduced, e.g. 3/5 instead of 6/10). You write a struct RawFraction that will contain the bare data. RawFraction instances are not always in simplest form, but they have a method fn reduce(&mut self) that reduces them.
Now you need a smart pointer type that you will always use to mutate the RawFraction, which calls .reduce() on the pointed-to struct when it's dropped. Let's call it RefMut, because that's the naming scheme RefCell uses. You implement Deref<Target = RawFraction>, DerefMut, and Drop on it, something like this:
pub struct RefMut<'a>(&'a mut RawFraction);
impl<'a> Deref for RefMut<'a> {
type Target = RawFraction;
fn deref(&self) -> &RawFraction {
self.0
}
}
impl<'a> DerefMut for RefMut<'a> {
fn deref_mut(&mut self) -> &mut RawFraction {
self.0
}
}
impl<'a> Drop for RefMut<'a> {
fn drop(&mut self) {
self.0.reduce();
}
}
Now, whenever you have a RefMut to a RawFraction and drop it, you know the RawFraction will be in simplest form afterwards. All you need to do at this point is ensure that RefMut is the only way to get &mut access to the RawFraction part of a Fraction.
pub struct Fraction(RawFraction);
impl Fraction {
pub fn new(numerator: i32, denominator: i32) -> Self {
// create a RawFraction, reduce it and wrap it up
}
pub fn borrow_mut(&mut self) -> RefMut {
RefMut(&mut self.0)
}
}
Pay attention to the pub markings (and lack thereof): I'm using those to ensure the soundness of the exposed interface. All three types should be placed in a module by themselves. It would be incorrect to mark the RawFraction field pub inside Fraction, since then it would be possible (for code outside the module) to create an unreduced Fraction without using new or get a &mut RawFraction without going through RefMut.
Supposing all this code is placed in a module named frac, you can use it something like this (assuming Fraction implements Display):
let f = frac::Fraction::new(3, 10);
println!("{}", f); // prints 3/10
f.borrow_mut().numerator += 3;
println!("{}", f); // prints 3/5
The types encode the invariant: Wherever you have Fraction, you can know that it's fully reduced. When you have a RawFraction, &RawFraction, etc., you can't be sure. If you want, you may also make RawFraction's fields non-pub, so that you can't get an unreduced fraction at all except by calling borrow_mut on a Fraction.

Basically the same thing is done in RefCell. There you want to reduce the runtime borrow-count when a borrow ends. Here you want to perform an arbitrary action.
So let's re-use the concept of writing a function that returns a wrapped reference:
struct Data {
content: i32,
}
impl Data {
fn borrow_mut(&mut self) -> DataRef {
println!("borrowing");
DataRef { data: self }
}
fn check_after_borrow(&self) {
if self.content > 50 {
println!("Hey, content should be <= {:?}!", 50);
}
}
}
struct DataRef<'a> {
data: &'a mut Data
}
impl<'a> Drop for DataRef<'a> {
fn drop(&mut self) {
println!("borrow ends");
self.data.check_after_borrow()
}
}
fn main() {
let mut d = Data { content: 42 };
println!("content is {}", d.content);
{
let b = d.borrow_mut();
//let c = &d; // Compiler won't let you have another borrow at the same time
b.data.content = 123;
println!("content set to {}", b.data.content);
} // borrow ends here
println!("content is now {}", d.content);
}
This results in the following output:
content is 42
borrowing
content set to 123
borrow ends
Hey, content should be <= 50!
content is now 123
Be aware that you can still obtain an unchecked mutable borrow with e.g. let c = &mut d;. This will be silently dropped without calling check_after_borrow.

Vec<MyTrait> without N heap allocations?

I'm trying to port some C++ code to Rust. It composes a virtual (.mp4) file from a few kinds of slices (string reference, lazy-evaluated string reference, part of a physical file) and serves HTTP requests based on the result. (If you're curious, see Mp4File which takes advantage of the FileSlice interface and its concrete implementations in http.h.)
Here's the problem: I want require as few heap allocations as possible. Let's say I have a few implementations of resource::Slice that I can hopefully figure out on my own. Then I want to make the one that composes them all:
pub trait Slice : Send + Sync {
/// Returns the length of the slice in bytes.
fn len(&self) -> u64;
/// Writes bytes indicated by `range` to `out.`
fn write_to(&self, range: &ByteRange,
out: &mut io::Write) -> io::Result<()>;
}
// (used below)
struct SliceInfo<'a> {
range: ByteRange,
slice: &'a Slice,
}
/// A `Slice` composed of other `Slice`s.
pub struct Slices<'a> {
len: u64,
slices: Vec<SliceInfo<'a>>,
}
impl<'a> Slices<'a> {
pub fn new() -> Slices<'a> { ... }
pub fn append(&mut self, slice: &'a resource::Slice) { ... }
}
impl<'a> Slice for Slices<'a> { ... }
and use them to append lots and lots of slices with as few heap allocations as possible. Simplified, something like this:
struct ThingUsedWithinMp4Resource {
slice_a: resource::LazySlice,
slice_b: resource::LazySlice,
slice_c: resource::LazySlice,
slice_d: resource::FileSlice,
}
struct Mp4Resource {
slice_a: resource::StringSlice,
slice_b: resource::LazySlice,
slice_c: resource::StringSlice,
slice_d: resource::LazySlice,
things: Vec<ThingUsedWithinMp4Resource>,
slices: resource::Slices
}
impl Mp4Resource {
fn new() {
let mut f = Mp4Resource{slice_a: ...,
slice_b: ...,
slice_c: ...,
slices: resource::Slices::new()};
// ...fill `things` with hundreds of things...
slices.append(&f.slice_a);
for thing in f.things { slices.append(&thing.slice_a); }
slices.append(&f.slice_b);
for thing in f.things { slices.append(&thing.slice_b); }
slices.append(&f.slice_c);
for thing in f.things { slices.append(&thing.slice_c); }
slices.append(&f.slice_d);
for thing in f.things { slices.append(&thing.slice_d); }
f;
}
}
but this isn't working. The append lines cause errors "f.slice_* does not live long enough", "reference must be valid for the lifetime 'a as defined on the block at ...", "...but borrowed value is only valid for the block suffix following statement". I think this is similar to this question about the self-referencing struct. That's basically what this is, with more indirection. And apparently it's impossible.
So what can I do instead?
I think I'd be happy to give ownership to the resource::Slices in append, but I can't put a resource::Slice in the SliceInfo used in Vec<SliceInfo> because resource::Slice is a trait, and traits are unsized. I could do a Box<resource::Slice> instead but that means a separate heap allocation for each slice. I'd like to avoid that. (There can be thousands of slices per Mp4Resource.)
I'm thinking of doing an enum, something like:
enum BasicSlice {
String(StringSlice),
Lazy(LazySlice),
File(FileSlice)
};
and using that in the SliceInfo. I think I can make this work. But it definitely limits the utility of my resource::Slices class. I want to allow it to be used easily in situations I didn't anticipate, preferably without having to define a new enum each time.
Any other options?

You can add a User variant to your BasicSlice enum, which takes a Box<SliceInfo>. This way only the specialized case of users will take the extra allocation, while the normal path is optimized.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

How do I appease the borrow checker with nested data structures? - rust

Related

How do I move out of a struct that has a destructor, if I don't want to run the destructor?

How can I move structs containing Vec to/from a static array without getting E0507?

Killing a process stored in Mutex<Option<Child>> [duplicate]

Can a type know when a mutable borrow to itself has ended?

Vec<MyTrait> without N heap allocations?

Categories

Resources