Which Rust structure does this? - rust

I'm sure there's something out there, but I'm missing the keyword for it :)
I want a Vec (or some other similar structure that uses the heap) that can hold only N values, but with the following twist:
If the capacity is full, the next .push() will drop the first value. So, the structure will keep being full, but with the latest pushed value at the bottom.
I can DIY it, but I'm new to Rust, so I fear that my implementation will be neither efficient nor elegant.
Thank you!

The data structure you are looking for is a circular buffer. You can find many implementations on https://crates.io/, but it's also not difficult to roll your own. Here's a minimal implementation that you could use as a starting point:
#[derive(Debug)]
pub struct CircularBuffer<T> {
start: usize,
data: Vec<T>,
}
impl<T> CircularBuffer<T> {
pub fn new(capacity: usize) -> Self {
Self {
start: 0,
data: Vec::with_capacity(capacity),
}
}
pub fn push(&mut self, item: T) {
if self.data.len() < self.data.capacity() {
self.data.push(item);
} else {
self.data[self.start] = item;
self.start += 1;
if self.start == self.data.capacity() {
self.start = 0;
}
}
}
pub fn get(&self, index: usize) -> Option<&T> {
if index >= self.data.len() {
return None;
}
let mut index = index + self.start;
// Wrap around: valid slots are 0..capacity, so `index == capacity` must wrap too.
if index >= self.data.capacity() {
index -= self.data.capacity();
}
self.data.get(index)
}
}

The standard library's VecDeque is in fact built on a circular buffer a.k.a. ring buffer, with the addition of growability. So, you could use VecDeque while making sure (possibly via a wrapper type) to delete an item before inserting one if it's full, and you'll get implementations of everything else (like iteration, and rearranging it to be contiguous if desired).
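A minimal sketch of such a wrapper, assuming the name BoundedQueue and its methods (they are illustrative, not part of the standard library). It only relies on VecDeque::with_capacity, len, pop_front, push_back and iter:

```rust
use std::collections::vec_deque;
use std::collections::VecDeque;

/// Hypothetical wrapper: a queue that holds at most `capacity` items and
/// evicts the oldest one when a push would exceed that limit.
#[derive(Debug)]
pub struct BoundedQueue<T> {
    capacity: usize,
    items: VecDeque<T>,
}

impl<T> BoundedQueue<T> {
    pub fn new(capacity: usize) -> Self {
        Self {
            capacity,
            items: VecDeque::with_capacity(capacity),
        }
    }

    /// Pushes `item` and returns the evicted oldest item, if any.
    pub fn push(&mut self, item: T) -> Option<T> {
        if self.capacity == 0 {
            // Nothing can be stored, so the new item is "evicted" at once.
            return Some(item);
        }
        let evicted = if self.items.len() == self.capacity {
            self.items.pop_front()
        } else {
            None
        };
        self.items.push_back(item);
        evicted
    }

    /// Iterates from the oldest item to the newest.
    pub fn iter(&self) -> vec_deque::Iter<'_, T> {
        self.items.iter()
    }
}

fn main() {
    let mut buf = BoundedQueue::new(3);
    for i in 1..=5 {
        buf.push(i);
    }
    // 1 and 2 were dropped; the three latest values remain.
    let contents: Vec<i32> = buf.iter().copied().collect();
    assert_eq!(contents, vec![3, 4, 5]);
}
```

The wrapper stores the requested capacity itself instead of asking the collection, because with_capacity only guarantees room for at least that many elements and may allocate more.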

Related

Binary tree node with pointer to siblings in Rust

I am trying to figure out the equivalent of the typical setSibling C code exercise:
// Assume the tree is fully balanced, i.e. the lowest level is fully populated.
struct Node {
Node * left;
Node * right;
Node * sibling;
}
void setSibling(Node * root) {
if (!root) return;
if (root->left) {
root->left->sibling = root->right;
if (root->sibling) root->right->sibling = root->sibling->left;
setSibling(root->left);
setSibling(root->right);
}
}
Of course Rust is a different world, so I am forced to think about ownership now. My lousy attempt.
struct TreeNode<'a> {
left: Option<&'a TreeNode<'a>>,
right: Option<&'a TreeNode<'a>>,
sibling: Option<&'a TreeNode<'a>>,
value: String
}
fn BuildTreeNode<'a>(aLeft: Option<&'a TreeNode<'a>>, aRight: Option<&'a TreeNode<'a>>, aValue: String) -> TreeNode<'a> {
TreeNode {
left: aLeft,
right: aRight,
value: aValue,
sibling: None
}
}
fn SetSibling(node: &mut Option<&TreeNode>) {
match node {
Some(mut n) => {
match n.left {
Some(mut c) => {
//c*.sibling = n.right;
match n.sibling {
Some(s) => { n.right.unwrap().sibling = s.left },
None => {}
}
},
None => {}
}
},
None => return
}
}
What's the canonical way to represent graph nodes like these?
Question: what's the canonical way to represent graph nodes like these?
It seems like a typical case of "confused ownership" once you introduce the sibling link: with a strict tree you can have each parent own its children, but the sibling link means this is a graph, and a given node has multiple owners.
AFAIK there are two main ways to resolve this, at least in safe Rust:
Reference counting and inner mutability. If each node is reference-counted, the sibling link can be a reference or weak reference with little trouble. The main drawbacks are that this requires inner mutability and that the navigation is gnarly, though a few utility methods can help.
"Unfold" the graph into an array, and use indices for your indirection through the tree. The main drawback is that this requires either threading the array through your calls or keeping a backreference (with inner mutability) to the array, or alternatively doing everything iteratively.
Both basically work around the ownership constraint, one by muddying the ownership itself, and the other by moving ownership to a higher power (the array).
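The second option could look roughly like the sketch below. The Tree and Node types and the add helper are assumptions made for illustration; the traversal is done iteratively with an explicit stack so that only the owning Tree is ever borrowed mutably.

```rust
/// Index-based tree: all nodes live in one Vec and links are indices into it.
#[derive(Debug, Default)]
struct Node {
    left: Option<usize>,
    right: Option<usize>,
    sibling: Option<usize>,
    value: u8,
}

#[derive(Debug, Default)]
struct Tree {
    nodes: Vec<Node>,
}

impl Tree {
    /// Adds a node and returns its index (hypothetical helper).
    fn add(&mut self, value: u8, left: Option<usize>, right: Option<usize>) -> usize {
        self.nodes.push(Node { left, right, sibling: None, value });
        self.nodes.len() - 1
    }

    /// Fills in the sibling links below `root`, parents before children.
    fn set_siblings(&mut self, root: usize) {
        let mut stack = vec![root];
        while let Some(i) = stack.pop() {
            // Copy the indices out so no borrow of `self.nodes` is kept alive.
            let (left, right, sibling) = {
                let n = &self.nodes[i];
                (n.left, n.right, n.sibling)
            };
            if let Some(l) = left {
                self.nodes[l].sibling = right;
                stack.push(l);
            }
            if let Some(r) = right {
                // The right child's sibling is the left child of our own sibling.
                let next = sibling.and_then(|s| self.nodes[s].left);
                self.nodes[r].sibling = next;
                stack.push(r);
            }
        }
    }
}

fn main() {
    let mut t = Tree::default();
    let n4 = t.add(4, None, None);
    let n5 = t.add(5, None, None);
    let n6 = t.add(6, None, None);
    let n7 = t.add(7, None, None);
    let n2 = t.add(2, Some(n4), Some(n5));
    let n3 = t.add(3, Some(n6), Some(n7));
    let n1 = t.add(1, Some(n2), Some(n3));
    t.set_siblings(n1);
    assert_eq!(t.nodes[n2].sibling, Some(n3));
    assert_eq!(t.nodes[n4].sibling, Some(n5));
    assert_eq!(t.nodes[n5].sibling, Some(n6)); // crosses the two subtrees
    assert_eq!(t.nodes[n7].sibling, None);
    assert_eq!(t.nodes[n1].value, 1);
}
```

Because the links are plain usize values, there is no shared ownership to manage; the cost is that an index is only meaningful together with the Tree it came from.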
https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=68d092d0d86dc32fe07902c832262ef4 seems to be more or less what you're looking for, using Rc and inner mutability:
use std::cell::RefCell;
use std::rc::{Rc, Weak};
#[derive(Default)]
pub struct TreeNode {
left: Option<Rc<TreeNode>>,
right: Option<Rc<TreeNode>>,
sibling: RefCell<Option<Weak<TreeNode>>>,
v: u8,
}
impl TreeNode {
pub fn new(v: u8) -> Rc<Self> {
Rc::new(TreeNode {
v,
..TreeNode::default()
})
}
pub fn new_with(left: Option<Rc<TreeNode>>, right: Option<Rc<TreeNode>>, v: u8) -> Rc<Self> {
Rc::new(TreeNode {
left,
right,
v,
sibling: RefCell::new(None),
})
}
pub fn set_siblings(self: &Rc<Self>) {
let Some(left) = self.left() else { return };
let right = self.right();
left.sibling.replace(right.map(Rc::downgrade));
if let Some(sibling) = self.sibling() {
// not sure this is correct, depending on construction, with
// 3 5
// \ \
// 2 4
// \/
// 1
// (2) has a sibling (4) but doesn't have a right node, so
// unconditionally setting right.sibling doesn't seem correct
right
.unwrap()
.sibling
.replace(sibling.left().map(Rc::downgrade));
}
left.set_siblings();
right.map(|r| r.set_siblings());
}
pub fn left(&self) -> Option<&Rc<Self>> {
self.left.as_ref()
}
pub fn right(&self) -> Option<&Rc<Self>> {
self.right.as_ref()
}
pub fn sibling(&self) -> Option<Rc<Self>> {
self.sibling.borrow().as_ref()?.upgrade()
}
}
fn main() {
let t = TreeNode::new_with(
TreeNode::new_with(TreeNode::new(1).into(), TreeNode::new(2).into(), 3).into(),
TreeNode::new(4).into(),
5,
);
t.set_siblings();
assert_eq!(t.left().and_then(|l| l.sibling()).unwrap().v, 4);
let ll = t.left().and_then(|l| l.left());
assert_eq!(ll.map(|ll| ll.v), Some(1));
ll.unwrap().sibling().unwrap();
assert_eq!(
t.left()
.and_then(|l| l.left())
.and_then(|ll| ll.sibling())
.unwrap()
.v,
2
);
}
Note that I assumed the tree is immutable once created, and that only the sibling links have to be generated after the fact, so I only added inner mutability for those. I also used weak pointers, which probably isn't necessary: if the tree is put in an inconsistent state, it's not like that'll save anything. All it requires is a few upgrade() and downgrade() calls instead of clone(), though, so it's not a huge imposition.
That aside, there are lots of issues with your attempt:
Having the same lifetime for your reference and your content is usually an error. The compiler will trust what you tell it, and that can have rather odd effects (e.g. telling the compiler that something gets borrowed forever).
SetSibling (incorrect naming conventions btw) taking an Option is... unnecessary. The function clearly expects to set a sibling, so just give it a sibling and remove the unnecessary outer layer of tests.
match is nice when you need it. Here you probably don't: if let would do the trick fine, especially since there is no else branch.
Rust generally uses methods, and the method to create an instance (if one such is needed) is idiomatically called new (if there's only one).

Implementing depth-first search using a stack

I've come across a problem that requires doing DFS on a tree defined like such:
pub struct TreeNode {
pub val: i32,
pub left: Option<Rc<RefCell<TreeNode>>>,
pub right: Option<Rc<RefCell<TreeNode>>>,
}
I want to use the non-recursive version of the algorithm with an explicit stack. The tree is read-only and the values in the tree are not guaranteed to be unique (they can't be used to identify a node).
The problem is that the iterative version requires a visited data structure. Normally, in C++, I'd just use a std::set of node pointers to implement visited. How would I do the same (or something analogous) in Rust? There doesn't seem to be an easy way to get a pointer to an object that I can use in a set.
First off, we don't need to keep track of visited if we know there are no circular dependencies. Normally binary trees don't have circular dependencies so we may be able to assume it simply is not an issue. In this case, we can use a VecDeque as our 'stack' queue.
use std::cell::RefCell;
use std::collections::VecDeque;
use std::rc::Rc;
type TreeNodeRef = Rc<RefCell<TreeNode>>;
pub struct TreeNode {
pub val: i32,
pub left: Option<TreeNodeRef>,
pub right: Option<TreeNodeRef>,
}
pub fn dfs(root: TreeNodeRef, target: i32) -> Option<TreeNodeRef> {
let mut queue = VecDeque::new();
queue.push_back(root);
while let Some(node) = queue.pop_front() {
// Check if this is the node we are looking for
if node.borrow().val == target {
return Some(node)
}
// Add left and right to the front of the queue so they are visited next (DFS)
let items = node.borrow();
if let Some(left) = &items.left {
queue.push_front(left.clone());
}
if let Some(right) = &items.right {
queue.push_front(right.clone());
}
}
// Search completed and node was not found
None
}
However, if we need to keep a list of visited nodes, we can cheat a little. An Rc<T> is just a boxed value with a reference count, so we can extract a pointer from it. Even though we cannot compare TreeNodes, we can store where they are kept in memory. When we do that, the solution looks like this:
use std::collections::HashSet;
pub fn dfs(root: TreeNodeRef, target: i32) -> Option<TreeNodeRef> {
let mut visited = HashSet::new();
let mut queue = VecDeque::new();
queue.push_back(root);
while let Some(node) = queue.pop_front() {
// Check node has not been visited yet
if visited.contains(&Rc::as_ptr(&node)) {
continue
}
// Insert node to visited list
visited.insert(Rc::as_ptr(&node));
if node.borrow().val == target {
return Some(node)
}
let items = node.borrow();
if let Some(left) = &items.left {
queue.push_front(left.clone());
}
if let Some(right) = &items.right {
queue.push_front(right.clone());
}
}
None
}
You may also find it interesting to look at the bottom 2 code examples in this answer to see how a generic search method could be made.
Edit: Alternatively, here is a Rust Playground of how this could be done with a regular Vec and Rc::clone(x) as recommended by @isaactfa.
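The contents of the linked playground are not reproduced here, so the following is only a sketch of what a Vec-based variant might look like, not the linked code. The node helper is hypothetical, and dfs takes the root by reference, which is a small deviation from the signatures above.

```rust
use std::cell::RefCell;
use std::collections::HashSet;
use std::rc::Rc;

type TreeNodeRef = Rc<RefCell<TreeNode>>;

pub struct TreeNode {
    pub val: i32,
    pub left: Option<TreeNodeRef>,
    pub right: Option<TreeNodeRef>,
}

/// Hypothetical helper to build nodes concisely.
fn node(val: i32, left: Option<TreeNodeRef>, right: Option<TreeNodeRef>) -> TreeNodeRef {
    Rc::new(RefCell::new(TreeNode { val, left, right }))
}

pub fn dfs(root: &TreeNodeRef, target: i32) -> Option<TreeNodeRef> {
    // Nodes are identified by the address of their allocation, not by value.
    let mut visited: HashSet<*const RefCell<TreeNode>> = HashSet::new();
    // A plain Vec works as a stack: push and pop both act on the end.
    let mut stack = vec![Rc::clone(root)];
    while let Some(current) = stack.pop() {
        // `insert` returns false when the pointer was already in the set.
        if !visited.insert(Rc::as_ptr(&current)) {
            continue;
        }
        if current.borrow().val == target {
            return Some(current);
        }
        let items = current.borrow();
        // Push right first so that the left subtree is explored first.
        if let Some(right) = &items.right {
            stack.push(Rc::clone(right));
        }
        if let Some(left) = &items.left {
            stack.push(Rc::clone(left));
        }
    }
    None
}

fn main() {
    let leaf = node(4, None, None);
    let root = node(
        1,
        Some(node(2, Some(Rc::clone(&leaf)), None)),
        Some(node(3, None, None)),
    );
    let found = dfs(&root, 4).expect("4 is in the tree");
    assert!(Rc::ptr_eq(&found, &leaf));
    assert!(dfs(&root, 9).is_none());
}
```

The raw pointers are only ever compared and hashed, never dereferenced, so no unsafe code is needed.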

Returning iterator from weak references for mapping and modifying values

I'm trying quite complex stuff with Rust where I need the following attributes, and am fighting the compiler.
Object which itself lives from start to finish of application, however, where internal maps/vectors could be modified during application lifetime
Multiple references to object that can read internal maps/vectors of an object
All single threaded
Multiple nested iterators which are map/modified in lazy manner to perform fast and complex calculations (see example below)
A small example, which already causes problems:
use std::cell::RefCell;
use std::rc::Rc;
use std::sync::Weak;
pub struct Holder {
array_ref: Weak<RefCell<Vec<isize>>>,
}
impl Holder {
pub fn new(array_ref: Weak<RefCell<Vec<isize>>>) -> Self {
Self { array_ref }
}
fn get_iterator(&self) -> impl Iterator<Item = f64> + '_ {
self.array_ref
.upgrade()
.unwrap()
.borrow()
.iter()
.map(|value| *value as f64 * 2.0)
}
}
get_iterator is just one of the implementations of a trait, but even this example already does not work.
The reason for Weak/Rc is to make sure that multiple places can point to the object (from point (1)) and that another place can modify its internals (Vec<isize>).
What is the best way to approach this situation, given that end goal is performance critical?
EDIT:
Someone suggested using https://doc.rust-lang.org/std/cell/struct.Ref.html#method.map
Unfortunately I still can't get it to work. I'm not sure whether I should also change the return type, or whether the closure is wrong here:
fn get_iterator(&self) -> impl Iterator<Item=f64> + '_ {
let x = self.array_ref.upgrade().unwrap().borrow();
let map1 = Ref::map(x, |x| &x.iter());
let map2 = Ref::map(map1, |iter| &iter.map(|y| *y as f64 * 2.0));
map2
}
IDEA says it has the wrong return type:
the trait `Iterator` is not implemented for `Ref<'_, Map<std::slice::Iter<'_, isize>, [closure#src/bin/main.rs:30:46: 30:65]>>`
This won't work because self.array_ref.upgrade() creates a local temporary Arc value, but the Ref only borrows from it. Obviously, you can't return a value that borrows from a local.
To make this work you need a second structure to own the Arc, which can implement Iterator in this case since the produced items aren't references:
use std::sync::Arc;
pub struct HolderIterator(Arc<RefCell<Vec<isize>>>, usize);
impl Iterator for HolderIterator {
type Item = f64;
fn next(&mut self) -> Option<f64> {
let r = self.0.borrow().get(self.1)
.map(|&y| y as f64 * 2.0);
if r.is_some() {
self.1 += 1;
}
r
}
}
// ...
impl Holder {
// ...
fn get_iterator<'a>(&'a self) -> Option<impl Iterator<Item=f64>> {
self.array_ref.upgrade().map(|rc| HolderIterator(rc, 0))
}
}
Alternatively, if you want the iterator to also weakly-reference the value contained within, you can have it hold a Weak instead and upgrade on each next() call. There are performance implications, but this also makes it easier to have get_iterator() be able to return an iterator directly instead of an Option, and the iterator written so that a failed upgrade means the sequence has ended:
pub struct HolderIterator(Weak<RefCell<Vec<isize>>>, usize);
impl Iterator for HolderIterator {
type Item = f64;
fn next(&mut self) -> Option<f64> {
let r = self.0.upgrade()?
.borrow()
.get(self.1)
.map(|&y| y as f64 * 2.0);
if r.is_some() {
self.1 += 1;
}
r
}
}
// ...
impl Holder {
// ...
fn get_iterator<'a>(&'a self) -> impl Iterator<Item=f64> {
HolderIterator(Weak::clone(&self.array_ref), 0)
}
}
This will make it so that you always get an iterator, but it's empty if the Weak is dead. The Weak can also die during iteration, at which point the sequence will abruptly end.
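To see that behaviour end to end, here is a small self-contained sketch. It swaps in std::rc::Rc and std::rc::Weak rather than Arc, on the assumption that the program really is single-threaded as the question states; the Holder type mirrors the one in the question.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

pub struct Holder {
    array_ref: Weak<RefCell<Vec<isize>>>,
}

/// Holds a weak handle plus the index of the next element to yield.
pub struct HolderIterator(Weak<RefCell<Vec<isize>>>, usize);

impl Iterator for HolderIterator {
    type Item = f64;
    fn next(&mut self) -> Option<f64> {
        // A failed upgrade (the Vec was dropped) simply ends the sequence.
        let r = self.0.upgrade()?
            .borrow()
            .get(self.1)
            .map(|&y| y as f64 * 2.0);
        if r.is_some() {
            self.1 += 1;
        }
        r
    }
}

impl Holder {
    pub fn new(array_ref: Weak<RefCell<Vec<isize>>>) -> Self {
        Self { array_ref }
    }

    pub fn get_iterator(&self) -> HolderIterator {
        HolderIterator(Weak::clone(&self.array_ref), 0)
    }
}

fn main() {
    let data = Rc::new(RefCell::new(vec![1, 2, 3]));
    let holder = Holder::new(Rc::downgrade(&data));
    let mut it = holder.get_iterator();

    assert_eq!(it.next(), Some(2.0));
    // No RefCell borrow is held between calls, so the owner may mutate here.
    data.borrow_mut()[1] = 10;
    assert_eq!(it.next(), Some(20.0));

    // Dropping the last strong reference ends the iteration early.
    drop(data);
    assert_eq!(it.next(), None);
}
```

Since each next() call borrows the RefCell only for the duration of that call, the owner can modify the vector between calls without triggering a runtime borrow panic.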

How can I recache data whenever a mutable reference to the source data is dropped?

I have a struct called Spire that contains some data (elements), and a cache of some result that can be calculated from that data. When elements changes, I want to be able to automatically update the cache (e.g. without the user of the struct having to manually call update_height in this case).
I'm trying to figure out how I can achieve that, or if there is a better way to do what I'm trying to do.
struct Spire {
elements: Vec<i32>,
height: i32,
}
impl Spire {
pub fn new(elements: Vec<i32>) -> Spire {
let mut out = Spire {
elements: elements,
height: 0,
};
out.update_height();
out
}
pub fn get_elems_mut(&mut self) -> &mut Vec<i32> {
&mut self.elements
}
pub fn update_height(&mut self) {
self.height = self.elements.iter().sum();
}
pub fn height(&self) -> i32 {
self.height
}
}
fn main() {
let mut spire = Spire::new(vec![1, 2, 3, 1]);
// Get a mutable reference to the internal elements
let spire_elems = spire.get_elems_mut();
// Do some stuff with the elements
spire_elems.pop();
spire_elems.push(7);
spire_elems.push(10);
// The compiler won't allow you to get height
// without dropping the mutable reference first
// dbg!(spire.height());
// When finished, drop the reference to the elements.
drop(spire_elems);
// I want to automatically run update_height() here somehow
dbg!(spire.height());
}
I am trying to find something with behavior like the Drop trait for mutable references.
There are at least two ways to tackle this problem. Instead of calling drop directly, you should put your code which does the mutation in a new scope so that the scoping rules will automatically be applied to them and drop will be called automatically for you:
fn main() {
let mut spire = Spire::new(vec![1, 2, 3, 1]);
{
let spire_elems = spire.get_elems_mut();
spire_elems.pop();
spire_elems.push(7);
spire_elems.push(10);
}
spire.update_height();
dbg!(spire.height());
}
If you compile this, it will work as expected. Generally speaking, if you have to call drop manually it usually means you are doing something that you shouldn't do.
That being said, the more interesting question is how to design an API that does not leak your abstraction. For example, you could protect your internal data structure representation by providing methods to manipulate it. This has several advantages, one of which is that you can freely change your mind later about which data structure you use internally without affecting other parts of your code, e.g.
impl Spire {
pub fn push(&mut self, elem: i32) {
self.elements.push(elem);
self.update_internals();
}
}
This example invokes a private method called update_internals which takes care of your internal data consistency after each update.
If you only want to update the internal values when all the additions and removals have happened, then you should implement a finalising method which you have to call every time you finished modifying your Spire instance, e.g.
spire.pop();
spire.push(7);
spire.push(10);
spire.commit();
To achieve such a thing, you have at least another two options: you could do it like the above example or you could use a builder pattern where you are doing modifications throughout a series of calls which will then only have effect when you call the last finalising call on the chain. Something like:
spire.remove_last().add(7).add(10).finalise();
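The chained style could be sketched as below, assuming the method names remove_last, add and finalise from the line above (none of them appear elsewhere in the question). Each mutating method returns &mut Self so that calls can be chained, and only finalise refreshes the cached height.

```rust
struct Spire {
    elements: Vec<i32>,
    height: i32,
}

impl Spire {
    pub fn new(elements: Vec<i32>) -> Self {
        let height: i32 = elements.iter().sum();
        Self { elements, height }
    }

    /// Removes the last element; the cache is left stale until `finalise`.
    pub fn remove_last(&mut self) -> &mut Self {
        self.elements.pop();
        self
    }

    /// Appends an element; the cache is left stale until `finalise`.
    pub fn add(&mut self, elem: i32) -> &mut Self {
        self.elements.push(elem);
        self
    }

    /// Recomputes the cached height once, after a batch of changes.
    pub fn finalise(&mut self) {
        self.height = self.elements.iter().sum();
    }

    pub fn height(&self) -> i32 {
        self.height
    }
}

fn main() {
    let mut spire = Spire::new(vec![1, 2, 3, 1]);
    spire.remove_last().add(7).add(10).finalise();
    // Elements are now 1, 2, 3, 7, 10.
    assert_eq!(spire.height(), 23);
}
```

The weakness of this design is that nothing forces the caller to end the chain with finalise; if it is forgotten, height() silently returns the stale value.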
Another approach could be to have an internal flag (a simple bool would do) which is changed to true every time there is an insertion or deletion. Your height method could cache the calculated data internally (e.g. using some Cell type for interior mutability) and if the flag is true then it will recalculate the value and set the flag back to false. It will return the cached value on every subsequent call until you do another modification. Here's a possible implementation:
use std::cell::Cell;
struct Spire {
elements: Vec<i32>,
height: Cell<i32>,
updated: Cell<bool>,
}
impl Spire {
fn calc_height(elements: &[i32]) -> i32 {
elements.iter().sum()
}
pub fn new(elements: Vec<i32>) -> Self {
Self {
height: Cell::new(Self::calc_height(&elements)),
elements,
updated: Cell::new(false),
}
}
pub fn push(&mut self, elem: i32) {
self.updated.set(true);
self.elements.push(elem);
}
pub fn pop(&mut self) -> Option<i32> {
self.updated.set(true);
self.elements.pop()
}
pub fn height(&self) -> i32 {
if self.updated.get() {
self.height.set(Self::calc_height(&self.elements));
self.updated.set(false);
}
self.height.get()
}
}
fn main() {
let mut spire = Spire::new(vec![1, 2, 3, 1]);
spire.pop();
spire.push(7);
spire.push(10);
dbg!(spire.height());
}
If you don't mind borrowing self mutably in the height getter, then don't bother with the Cell, just update the values directly.
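A sketch of that simpler variant, under the assumption that every caller of height can supply a mutable borrow. The field name dirty is illustrative; it plays the role of the updated flag above.

```rust
struct Spire {
    elements: Vec<i32>,
    height: i32,
    dirty: bool, // true when `height` no longer matches `elements`
}

impl Spire {
    pub fn new(elements: Vec<i32>) -> Self {
        let height: i32 = elements.iter().sum();
        Self { elements, height, dirty: false }
    }

    pub fn push(&mut self, elem: i32) {
        self.dirty = true;
        self.elements.push(elem);
    }

    pub fn pop(&mut self) -> Option<i32> {
        self.dirty = true;
        self.elements.pop()
    }

    /// Takes `&mut self`, so plain fields can be updated without a Cell.
    pub fn height(&mut self) -> i32 {
        if self.dirty {
            self.height = self.elements.iter().sum();
            self.dirty = false;
        }
        self.height
    }
}

fn main() {
    let mut spire = Spire::new(vec![1, 2, 3, 1]);
    spire.pop();
    spire.push(7);
    spire.push(10);
    assert_eq!(spire.height(), 23);
}
```

The trade-off is that height can no longer be called through a shared reference, which matters if a Spire is handed out as &Spire to read-only code.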

Correct way to implement container-element relationship in idiomatic Rust

I know why Rust doesn't like my code. However, I don't know what would be the idiomatic Rust approach to the problem.
I'm a C# programmer, and while I feel I understand Rust's system, I think my "old" approach to some problems doesn't work in Rust at all.
This code reproduces the problem I'm having, and it probably doesn't look like idiomatic Rust (or maybe it doesn't even look good in C# either):
//a "global" container for the elements and some extra data
struct Container {
elements: Vec<Element>,
global_contextual_data: i32,
//... more contextual data fields
}
impl Container {
//this just calculates whatever I need based on the contextual data
fn calculate_contextual_data(&self) -> i32 {
//This function will end up using the elements vector and the other fields as well,
//and will do some wacky maths with it.
//That's why I currently have the elements stored in the container
}
}
struct Element {
element_data: i32,
//other fields
}
impl Element {
//I need to take a mutable reference to update element_data,
//and a reference to the container to calculate something that needs
//this global contextual data... including the other elements, as previously stated
fn update_element_data(&mut self, some_data: i32, container: &Container) {
self.element_data *= some_data + container.calculate_contextual_data() //do whatever maths I need
}
}
fn main(){
//let it be mutable so I can assign the elements later
let mut container = Container {
elements: vec![],
global_contextual_data: 1
};
//build a vector of elements
let elements = vec![
Element {
element_data: 5
},
Element {
element_data: 7
}
];
//this works
container.elements = elements;
//and this works, but container is now borrowed as mutable
for elem in container.elements.iter_mut() {
elem.element_data += 1; //and while this works
let some_data = 2;
//i can't borrow it as immutable here and pass to the other function
elem.update_element_data(some_data, &container);
}
}
I understand why elem.update_element_data(some_data, &container); won't work: I'm already borrowing it as mutable when I call iter_mut. Maybe each element should have a reference to the container? But then wouldn't I have more opportunities to break at borrow-checking?
I don't think it's possible to bring my old approach to this new system. Maybe I need to rewrite the whole thing. Can someone point me to the right direction? I've just started programming in Rust, and while the ownership system is making some sort of sense to me, the code I should write "around" it is still not that clear.
I came across this question:
What's the Rust way to modify a structure within nested loops? which gave me insight into my problem.
I revisited the problem and boiled the problem down to the sharing of the vector by borrowing for writes and reads at the same time. This is just forbidden by Rust. I don't want to circumvent the borrow checker using unsafe. I was wondering, though, how much data should I copy?
My Element, which in reality is the entity of a game (I'm simulating a clicker game) has both mutable and immutable properties, which I broke apart.
struct Entity {
type: EntityType,
starting_price: f64,
...
...
status: Cell<EntityStatus>
}
Every time I need to change the status of an entity, I need to call get and set methods on the status field. EntityStatus derives Clone, Copy.
I could even put the fields directly on the struct and have them all be Cells but then it would be cumbersome to work with them (lots of calls to get and set), so I went for the more aesthetically pleasant approach.
By allowing myself to copy the status, edit and set it back, I could borrow the array immutably twice (.iter() instead of .iter_mut()).
I was afraid that the performance would be bad due to the copying, but in reality it was pretty good once I compiled with opt-level=3. If it gets problematic, I might change the fields to be Cells or come up with another approach.
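A rough sketch of the copy-out, edit, copy-back pattern described above. The fields of EntityStatus and the helper functions are invented for illustration, since the real ones are elided in the struct shown earlier.

```rust
use std::cell::Cell;

// Hypothetical mutable part of an entity; Copy so it can move in and out of a Cell.
#[derive(Clone, Copy, Debug, PartialEq)]
struct EntityStatus {
    owned: u32,
    price: f64,
}

struct Entity {
    starting_price: f64,        // never changes after construction
    status: Cell<EntityStatus>, // can change through a shared reference
}

impl Entity {
    fn new(starting_price: f64, owned: u32) -> Self {
        let status = EntityStatus { owned, price: starting_price };
        Self { starting_price, status: Cell::new(status) }
    }
}

// Reads every entity through a shared borrow of the slice.
fn total_owned(entities: &[Entity]) -> u32 {
    entities.iter().map(|e| e.status.get().owned).sum()
}

// Updates each entity while the whole slice stays readable.
fn reprice_all(entities: &[Entity]) {
    for entity in entities.iter() {
        let total = total_owned(entities); // second shared borrow is allowed
        let mut status = entity.status.get(); // copy the status out
        status.price = entity.starting_price * (1.0 + f64::from(total));
        entity.status.set(status); // copy it back
    }
}

fn main() {
    let entities = vec![Entity::new(10.0, 1), Entity::new(20.0, 2)];
    reprice_all(&entities);
    assert_eq!(entities[0].status.get().price, 40.0);
    assert_eq!(entities[1].status.get().price, 80.0);
}
```

Only shared references are ever taken, so the loop and the helper can both look at the full slice at the same time.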
Just do the computation outside:
#[derive(Debug)]
struct Container {
elements: Vec<Element>
}
impl Container {
fn compute(&self) -> i32 {
return 42;
}
fn len(&self) -> usize {
return self.elements.len();
}
fn at_mut(&mut self, index: usize) -> &mut Element {
return &mut self.elements[index];
}
}
#[derive(Debug)]
struct Element {
data: i32
}
impl Element {
fn update(&mut self, data: i32, computed_data: i32) {
self.data *= data + computed_data;
}
}
fn main() {
let mut container = Container {
elements: vec![Element {data: 1}, Element {data: 3}]
};
println!("{:?}", container);
for i in 0..container.len() {
let computed_data = container.compute();
container.at_mut(i).update(2, computed_data);
}
println!("{:?}", container);
}
Another option is to add an update_element to your container:
#[derive(Debug)]
struct Container {
elements: Vec<Element>
}
impl Container {
fn compute(&self) -> i32 {
let sum = self.elements.iter().map(|e| {e.data}).reduce(|a, b| {a + b});
return sum.unwrap_or(0);
}
fn len(&self) -> usize {
return self.elements.len();
}
fn at_mut(&mut self, index: usize) -> &mut Element {
return &mut self.elements[index];
}
fn update_element(&mut self, index: usize, data: i32) {
let computed_data = self.compute();
self.at_mut(index).update(data, computed_data);
}
}
#[derive(Debug)]
struct Element {
data: i32
}
impl Element {
fn update(&mut self, data: i32, computed_data: i32) {
self.data *= data + computed_data;
}
}
fn main() {
let mut container = Container {
elements: vec![Element {data: 1}, Element {data: 3}]
};
println!("{:?}", container);
for i in 0..container.len() {
let computed_data = container.compute();
container.at_mut(i).update(2, computed_data);
}
println!("{:?}", container);
for i in 0..container.len() {
container.update_element(i, 2);
}
println!("{:?}", container);
}