Flattening a nested structure - rust

Looking for wisdom on fixing this borrow-checker/lifetime issue in Rust. I'm trying to flatten a generic nested structure (into an impl Iterator or Vec). It's perhaps a few &s and 's away from working:
fn iter_els(prev_result: Vec<&El>) -> Vec<&El> {
// Iterate over all elements from a tree, starting at the top-level element.
let mut result = prev_result.clone();
for el in prev_result {
for child in &el.children {
result.push(&child.clone());
}
result.extend(iter_els(&el.children));
}
result
}
You'll note that the immediate error this raises is that iter_els expects a Vec of refs, not a ref itself. When addressing this directly, other issues rear their mischievous heads, as in a game of oxidized, but safe, whack-a-mole.

There are various solutions to this task. One would be to pass the result as an out-parameter to the function:
fn iter_els<'el>(el_top: &'el El, result: &mut Vec<&'el El>) {
result.push(el_top);
for el in &el_top.children {
iter_els(el, result);
}
}
fn main() {
// build top_el as you did
let mut result = Vec::new();
iter_els(&top_el, &mut result);
println!("{:?}", result);
}
Adapting your original approach imho results in a more complex implementation:
fn iter_els<'el>(prev_result: &Vec<&'el El>) -> Vec<&'el El> {
// Iterate over all elements from a tree, starting at the top-level element.
let mut result = prev_result.clone();
for el in prev_result {
for child in &el.children {
result.push(&child);
}
result.extend(iter_els(&el.children.iter().collect()));
}
result
}
fn main() {
// build top_el as you did
println!("{:?}", iter_els(&vec![&top_el]));
}
Alternatively:
fn iter_els<'el>(prev_result: &'el Vec<El>) -> Vec<&'el El> {
// Iterate over all elements from a tree, starting at the top-level element.
let mut result : Vec<_> = prev_result.iter().collect();
for el in prev_result {
for child in &el.children {
result.push(child);
}
result.extend(iter_els(&el.children));
}
result
}
fn main() {
// build top_el as you did
println!("{:?}", iter_els(&vec![top_el]));
}
As you can see, the first approach only operates on an immutable El, and one single result Vec, while the other implementations do not get around clone and collect.
Ideally, you would write a custom Iterator for your tree, but I think this could get quite cumbersome, because the iterator would have to keep track of its current position in the tree somehow (maybe someone can prove me wrong and show that it's actually easy to do).
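For what it's worth, keeping the state in an explicit stack of pending elements seems to keep such an iterator fairly small. The following is only a minimal sketch: it assumes an El type with a children: Vec<El> field as in the question, while the name field and the ElIter type are made up for illustration.

```rust
// Hypothetical element type; only the `children` field is taken from the question.
#[derive(Debug)]
struct El {
    name: String,
    children: Vec<El>,
}

// Depth-first (pre-order) iterator. `stack` holds the elements still to be visited.
struct ElIter<'el> {
    stack: Vec<&'el El>,
}

impl<'el> Iterator for ElIter<'el> {
    type Item = &'el El;

    fn next(&mut self) -> Option<Self::Item> {
        // Take the next pending element, or stop when nothing is left.
        let el = self.stack.pop()?;
        // Push the children in reverse so that the first child is popped first.
        self.stack.extend(el.children.iter().rev());
        Some(el)
    }
}

// Starts the traversal at the top-level element.
fn iter_els(el_top: &El) -> ElIter<'_> {
    ElIter { stack: vec![el_top] }
}
```

Like the out-parameter version, this only borrows the tree immutably and never clones an El; the caller can still write iter_els(&top_el).collect::<Vec<_>>() when a Vec is needed.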

Related

How can I easily get a reference to a value after it has been moved into a tuple-type enum variant?

I want to move a value into a tuple-type enum variant and obtain a reference to the value after it has been moved. I see how this is possible with an if let statement, but it seems like this should be unnecessary when the particular variant is known statically.
Is there any way to get the reference to the moved value without requiring an if let or match?
This code block is a simple illustration of my question (see below for a more challenging case):
enum Transport {
Car(u32), // horsepower
Horse(String), // name
}
fn do_something(x: &String) {
println!("{}", x);
}
fn main() {
// Can I avoid needing this if, which is clearly redundant?
if let Transport::Horse(ref name) = Transport::Horse("daisy".into()) {
do_something(name);
}
else {
// Can never happen
}
// I tried the following, it gives:
// "error[E0005]: refutable pattern in local binding: `Car(_)` not covered"
let Transport::Horse(ref name) = Transport::Horse("daisy".into());
}
It is easy to find ways to side-step the issue in the above code, since there are no real interface requirements. Consider instead the following example, where I am building a simple API for building trees (where each node can have n children). Nodes have an add_child_node method returning a reference to the node that was added, to allow chaining of calls to quickly build deep trees. (It is debatable whether this is a good API, but that is irrelevant to the question). add_child_node must return a mutable reference to the contents of an enum variant. Is the if let required in this example (without changing the API)?
struct Node {
children: Vec<Child>,
// ...
}
enum Child {
Node(Node),
Leaf
}
impl Node {
fn add_child_node(&mut self, node: Node) -> &mut Node {
self.children.push(Child::Node(node));
// It seems like this if should be unnecessary
if let Some(&mut Child::Node(ref mut x)) = self.children.last_mut() {
return x;
}
// Required to compile, since we must return something
unreachable!();
}
fn add_child_leaf(&mut self) {
// ...
}
}
No. You can use unreachable!() for the else case, and it's usually clear even without message/comment what's going on. The compiler is also very likely to optimize the check away.
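As a rough sketch of that suggestion applied to the tree API from the question (the Node and Child types are copied from the question with the elided parts left out, and the panic message is my own wording):

```rust
struct Node {
    children: Vec<Child>,
}

enum Child {
    Node(Node),
    Leaf,
}

impl Node {
    fn add_child_node(&mut self, node: Node) -> &mut Node {
        self.children.push(Child::Node(node));
        // The last child is the node pushed on the previous line, so the
        // fallback arm documents an impossible case instead of handling it.
        match self.children.last_mut() {
            Some(Child::Node(x)) => x,
            _ => unreachable!("a Child::Node was pushed just above"),
        }
    }
}
```

Using match as the tail expression also removes the need for the early return followed by a separate unreachable!().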
If the variants have the same type you can implement AsRef and use the Transport as a &str:
enum Transport {
Car(String),
Horse(String),
}
fn do_something<S: AsRef<str>>(x: &S) {
println!("{}", x.as_ref());
}
impl AsRef<str> for Transport {
fn as_ref(&self) -> &str {
match self {
Transport::Car(s) => s,
Transport::Horse(s) => s,
}
}
}
fn main() {
let transport = Transport::Horse("daisy".into());
do_something(&transport)
}
Otherwise you need to use an if let binding as you are doing. There is no need for an else clause if you don't want one:
if let Transport::Horse(ref name) = Transport::Horse("daisy".into()) {
do_something(name);
}
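If your toolchain supports let-else statements (Rust 1.65 or later, which is an assumption about your compiler and not something the question states), the same check can be written without nesting. A small sketch, where horse_name is a hypothetical helper and not part of the question:

```rust
enum Transport {
    Car(u32),      // horsepower
    Horse(String), // name
}

// Hypothetical helper: returns the name when the caller knows it holds a Horse.
fn horse_name(t: &Transport) -> &str {
    // The pattern is still refutable, so the `else` block has to diverge.
    let Transport::Horse(name) = t else {
        unreachable!("caller guarantees a Horse");
    };
    name
}
```

This does not remove the runtime check; it only moves the impossible branch out of the way so that name stays in scope for the rest of the function.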
Another option is to define From<Transport> for String:
…
impl From<Transport> for String {
fn from(t: Transport) -> String {
match t {
Transport::Car(value) => value.to_string(),
Transport::Horse(name) => name,
}
}
}
fn do_something(x: Transport) {
println!("{}", String::from(x));
}
fn main() {
let horse = Transport::Horse("daisy".to_string());
let car = Transport::Car(150);
do_something(horse);
do_something(car);
}

Unwrap a BTreeSet in rust

In Rust, the following function is legal:
fn unwrap<T>(s:Option<T>) -> T {
s.unwrap()
}
It takes ownership of s, panics if it is a None, and returns ownership of the contents of s (which is legal since an Option owns its contents).
I was trying to write a similar function with signature
fn unwrap_set<T>(s: BTreeSet<T>) -> T {
...
}
The idea is that it panics unless s has size 1, in which case it returns the single element. It seems like this should be possible for the same reason unwrap is possible, however none of the methods on BTreeSet had the right signature (they would need to have return type T). The closest was take, and I tried to do
let mut s2 = s;
let v: &T = s2.iter().next().unwrap();
s2.take(v).unwrap()
but this failed.
Is writing a function like unwrap_set possible?
The easiest way to do this would be to use BTreeSet<T>'s implementation of IntoIterator, which would allow you to easily pull owned values out of the set one at a time:
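use std::collections::BTreeSet;
fn unwrap_set<T>(s: BTreeSet<T>) -> T {
// into_iter consumes the set and yields owned values.
let mut it = s.into_iter();
if let Some(first) = it.next() {
// Only accept the value if there is no second one.
if it.next().is_none() {
return first;
}
}
panic!("set must have a single value");
}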
If you wanted to indirectly rely on IntoIterator you could also use a normal loop, but I don't think it's as readable that way so I probably wouldn't do this:
fn unwrap_set<T>(s: BTreeSet<T>) -> T {
let mut result = None;
for item in s {
// If there is a second value, bail out
if let Some(_) = result {
result = None;
break;
}
result = Some(item);
}
return result.expect("set must have a single value");
}
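If panicking is not always what you want, the same into_iter idea can be split into a non-panicking helper plus a thin wrapper. This is just a sketch; the name single is made up here and is not a BTreeSet method.

```rust
use std::collections::BTreeSet;

// Hypothetical helper: Some(value) if the set has exactly one element, else None.
fn single<T>(s: BTreeSet<T>) -> Option<T> {
    let mut it = s.into_iter();
    // Tuple fields are evaluated left to right: first element, then a possible second.
    match (it.next(), it.next()) {
        (Some(only), None) => Some(only),
        _ => None,
    }
}

// Same contract as in the question: panics unless the set has exactly one element.
fn unwrap_set<T>(s: BTreeSet<T>) -> T {
    single(s).expect("set must have a single value")
}
```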

Adding an append method to a singly linked list

I was looking through the singly linked list example on rustbyexample.com and I noticed the implementation had no append method, so I decided to try and implement it:
fn append(self, elem: u32) -> List {
let mut node = &self;
loop {
match *node {
Cons(_, ref tail) => {
node = tail;
},
Nil => {
node.prepend(elem);
break;
},
}
}
return self;
}
The above is one of many different attempts, but I cannot seem to find a way to iterate down to the tail and modify it, then somehow return the head, without upsetting the borrow checker in some way.
I am trying to figure out a solution that doesn't involve copying data or doing additional bookkeeping outside the append method.
As described in "Cannot obtain a mutable reference when iterating a recursive structure: cannot borrow as mutable more than once at a time", you need to transfer ownership of the mutable reference when performing iteration. This is needed to ensure you never have two mutable references to the same thing.
We use code similar to that Q&A to get a mutable reference to the last item (back), which will always be the Nil variant. We then call it and set that Nil item to a Cons. We wrap all that with a by-value function because that's what the API wants.
No extra allocation, no risk of running out of stack frames.
use List::*;
#[derive(Debug)]
enum List {
Cons(u32, Box<List>),
Nil,
}
impl List {
fn back(&mut self) -> &mut List {
let mut node = self;
loop {
match {node} {
&mut Cons(_, ref mut next) => node = next,
other => return other,
}
}
}
fn append_ref(&mut self, elem: u32) {
*self.back() = Cons(elem, Box::new(Nil));
}
fn append(mut self, elem: u32) -> Self {
self.append_ref(elem);
self
}
}
fn main() {
let n = Nil;
let n = n.append(1);
println!("{:?}", n);
let n = n.append(2);
println!("{:?}", n);
let n = n.append(3);
println!("{:?}", n);
}
When non-lexical lifetimes are enabled, this function can be more obvious:
fn back(&mut self) -> &mut List {
let mut node = self;
while let Cons(_, next) = node {
node = next;
}
node
}
As the len method is implemented recursively, I have done the same for the append implementation:
fn append(self, elem: u32) -> List {
match self {
Cons(current_elem, tail_box) => {
let tail = *tail_box;
let new_tail = tail.append(elem);
new_tail.prepend(current_elem)
}
Nil => {
List::new().prepend(elem)
}
}
}
One possible iterative solution would be to implement append in terms of prepend and a reverse function, like so (it won't be as performant but should still only be O(N)):
// Reverses the list
fn rev(self) -> List {
let mut result = List::new();
let mut current = self;
while let Cons(elem, tail) = current {
result = result.prepend(elem);
current = *tail;
}
result
}
fn append(self, elem: u32) -> List {
self.rev().prepend(elem).rev()
}
So, it's actually going to be slightly more difficult than you may think; mostly because Box is really missing a destructive take method which would return its content.
Easy way: the recursive way, no return.
fn append_rec(&mut self, elem: u32) {
match *self {
Cons(_, ref mut tail) => tail.append_rec(elem),
Nil => *self = Cons(elem, Box::new(Nil)),
}
}
This is relatively easy, as mentioned.
Harder way: the recursive way, with return.
fn append_rec(self, elem: u32) -> List {
match self {
Cons(e, tail) => Cons(e, Box::new((*tail).append_rec(elem))),
Nil => Cons(elem, Box::new(Nil)),
}
}
Note that this is grossly inefficient. For a list of size N, we are destroying N boxes and allocating N new ones. In-place mutation (the first approach) was much better in this regard.
Harder way: the iterative way, with no return.
fn append_iter_mut(&mut self, elem: u32) {
let mut current = self;
loop {
match {current} {
&mut Cons(_, ref mut tail) => current = tail,
c @ &mut Nil => {
*c = Cons(elem, Box::new(Nil));
return;
},
}
}
}
Okay... so iterating (mutably) over a nested data structure is not THAT easy because ownership and borrow-checking will ensure that:
a mutable reference is never copied, only moved,
a mutable reference with an outstanding borrow cannot be modified.
This is why here:
we use {current} to move current into the match,
we use c @ &mut Nil because we need to name the match of &mut Nil since current has been moved.
Note that thankfully rustc is smart enough to check the execution path and detect that it's okay to continue looping as long as we take the Cons branch since we reinitialize current in that branch, however it's not okay to continue after taking the Nil branch, which forces us to terminate the loop :)
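On compilers with non-lexical lifetimes the {current} trick should no longer be needed, because reassigning current ends the borrow taken by the pattern. The following is a sketch of the same in-place append under that assumption; the PartialEq derive is my addition, only there so the result can be compared.

```rust
#[derive(Debug, PartialEq)]
enum List {
    Cons(u32, Box<List>),
    Nil,
}

impl List {
    fn append_iter_mut(&mut self, elem: u32) {
        let mut current = self;
        // Walk down the list; `tail` is a `&mut Box<List>` borrowed from `*current`.
        while let List::Cons(_, tail) = current {
            // Reborrow the boxed tail as `&mut List` and move the cursor onto it.
            current = &mut **tail;
        }
        // The loop only stops at `Nil`, which is overwritten in place.
        *current = List::Cons(elem, Box::new(List::Nil));
    }
}
```

As in the version above, exactly one Box is allocated per append and no existing node is moved.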
Harder way: the iterative way, with return
fn append_iter(self, elem: u32) -> List {
// First pass: push every element onto a temporary list, which reverses the order.
let mut stack = List::new();
let mut current = self;
while let Cons(e, tail) = current {
stack = stack.prepend(e);
current = *tail; // move the tail out of its Box
}
// Second pass: start from the new element and put the stacked elements back in front of it.
let mut result = List::new().prepend(elem);
while let Cons(e, tail) = stack {
result = result.prepend(e);
stack = *tail;
}
result
}
In the recursive way, we were using the stack to keep the items for us, here we use a stack structure instead.
It's even more inefficient than the recursive way with return was; each node causes two deallocations and two allocations.
TL;DR: in-place modifications are generally more efficient, don't be afraid of using them when necessary.

Convert vectors to arrays and back [duplicate]

This question already has an answer here:
Is there a good way to convert a Vec<T> to an array?
(1 answer)
Closed 7 years ago.
I am attempting to figure out the most Rust-like way of converting from a vector to an array and back. These macros will work and can even be made generic with some unsafe blocks, but it all feels very un-Rust-like.
I would appreciate any input, so pull no punches; I think this code is far from nice or optimal. I have only played with Rust for a few weeks now and am still chasing releases and docs, so I really appreciate the help.
macro_rules! convert_u8vec_to_array {
($container:ident, $size:expr) => {{
if $container.len() != $size {
None
} else {
use std::mem;
let mut arr : [_; $size] = unsafe { mem::uninitialized() };
for element in $container.into_iter().enumerate() {
let old_val = mem::replace(&mut arr[element.0],element.1);
unsafe { mem::forget(old_val) };
}
Some(arr)
}
}};
}
fn array_to_vec(arr: &[u8]) -> Vec<u8> {
let mut vector = Vec::new();
for i in arr.iter() {
vector.push(*i);
}
vector
}
fn vector_as_u8_4_array(vector: Vec<u8>) -> [u8;4] {
let mut arr = [0u8;4];
for i in 0..4 {
arr[i] = vector[i];
}
arr
}
The code seems fine to me, although there's a very important safety thing to note: there can be no panics while arr isn't fully initialised. Running destructors on uninitialised memory could easily lead to undefined behaviour, and, in particular, this means that into_iter and the next method of its iterator should never panic (I believe it is impossible for the enumerate and mem::* parts of the loop to panic given the constraints of the code).
That said, one can express the replace/forget idiom with a single function: std::ptr::write.
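for (idx, element) in $container.into_iter().enumerate() {
// ptr::write is an unsafe fn: it overwrites the slot without reading or dropping the old value.
unsafe { ptr::write(&mut arr[idx], element); }
}
Although, I would write it as:
for (place, element) in arr.iter_mut().zip($container.into_iter()) {
unsafe { ptr::write(place, element); }
}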
Similarly, one can apply some iterator goodness to the u8 specialised versions:
fn array_to_vec(arr: &[u8]) -> Vec<u8> {
arr.iter().cloned().collect()
}
fn vector_as_u8_4_array(vector: Vec<u8>) -> [u8;4] {
let mut arr = [0u8;4];
for (place, element) in arr.iter_mut().zip(vector.iter()) {
*place = *element;
}
arr
}
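Although the first is probably better written as arr.to_vec(), and the second as
let mut arr = [0u8; 4];
arr.copy_from_slice(&vector[..4]);
arr
Here copy_from_slice is the stable replacement for std::slice::bytes::copy_memory, which was unstable (nightly-only) when this was first written and has since been removed from the standard library. It panics if the two slices differ in length.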
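On current toolchains the standard library also covers the owned Vec-to-array direction through TryFrom, which avoids the macro and the unsafe code altogether. A minimal sketch, assuming a compiler recent enough to provide TryFrom<Vec<T>> for [T; N]; the function name vec_to_array4 is made up for this example.

```rust
use std::convert::TryFrom;

// Hypothetical helper: Some(array) if the vector holds exactly four bytes, else None.
fn vec_to_array4(vector: Vec<u8>) -> Option<[u8; 4]> {
    // On a length mismatch `try_from` returns the original Vec as the error value,
    // which `ok()` discards here.
    <[u8; 4]>::try_from(vector).ok()
}

// The reverse direction: copy a slice (or an array, via deref) into a new Vec.
fn array_to_vec(arr: &[u8]) -> Vec<u8> {
    arr.to_vec()
}
```

Returning an Option mirrors the length check in the original macro, which also yields None when the sizes do not match.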

How do I automatically clear an attribute in a struct when it is moved?

I have a struct
struct Test {
list: Vec<u64>
}
and a method in which I would like to return the vector and reset the list field to an empty Vec:
fn get_list(&self) -> Vec<u64> {
let list = Vec::new();
for item in self.list.drain() {
list.push(item);
}
list
}
Is there another approach for doing this? Something like automatically reinitializing the field when its value is moved out, for example:
fn get_list(&self) -> ???<Vec<u64>> {
self.list
}
Here is a solution; you can test it on the Rust playground (sadly the share button doesn't work for me atm).
use std::mem;
#[derive(Debug)]
struct Test {
list: Vec<u64>
}
impl Test {
fn get_list(&mut self) -> Vec<u64> {
let repl = mem::replace(&mut self.list, Vec::new());
repl
}
}
fn main() {
let mut r = Test {
list : vec![1,2,3]
};
print!("r : {:?} ", r);
print!("replace : {:?} ", r.get_list());
print!("r : {:?} ", r);
}
You just need to run mem::replace on a mutable value and replace it with a value that will be moved into its place. In this case the destination is self.list and the value we are replacing it with is an empty Vec.
Things to note:
Field self.list of Test needs to be taken as &mut self.list.
The previous change implies that self should be mutable as well.
The second parameter of replace is moved, which means it won't be available for further use after this call. In practice, you either pass it a freshly constructed Vec (e.g. Vec::new()) or a clone of the replacement value.
From #rust IRC
< theme> jiojiajiu, http://doc.rust-lang.org/nightly/std/mem/fn.replace.html
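For types that implement Default, newer standard library versions also offer std::mem::take, which is shorthand for replacing the value with its default. A small sketch of get_list written that way, assuming a toolchain that ships mem::take:

```rust
use std::mem;

#[derive(Debug)]
struct Test {
    list: Vec<u64>,
}

impl Test {
    fn get_list(&mut self) -> Vec<u64> {
        // Moves the current Vec out and leaves `Vec::default()` (an empty Vec) behind.
        mem::take(&mut self.list)
    }
}
```

The same notes apply as for replace: the field is borrowed as &mut, so the method has to take &mut self.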

Resources