Multiple mutable borrows with Graph-like data structures - rust

I am trying to write a program that will find the longest path in the graph (i.e. the greatest depth) for a directed graph which is always a rooted or multi-rooted tree.
The specs of the assignment require I use DFS and memoization, but multiple mutable references occur when performing the DFS. Is there any other way to do this?
I thought about using HashMaps instead of internal Graph fields, but that would just produce the same error on mutability of the HashMap. I've found several other questions on the Rust user forum and here, but none of them gives advice on how to resolve this. Am I supposed to use "unsafe" code or some other strategy?
use std::io;
struct Node {
neighbours: Vec<usize>,
depth: usize,
visited: bool,
}
impl Node {
fn new() -> Node { Node { neighbours: Vec::new(), depth: 0, visited: false } }
fn add_neighbour(&mut self, node: usize) { self.neighbours.push(node); }
fn neighbourhood_size(&self) -> usize { self.neighbours.len() }
}
struct Graph {
nodes: Vec<Node>,
depth: usize,
}
impl Graph {
fn new() -> Graph { Graph { nodes: Vec::new(), depth: 0} }
fn nodes_number(&self) -> usize { self.nodes.len()}
fn add_node(&mut self) { self.nodes.push(Node::new()); }
fn node(&mut self, i: usize) -> &mut Node { &mut self.nodes[i] }
fn dfs(graph: &mut Graph, index: usize) {
if !graph.node(index).visited {
graph.node(index).visited = true;
}
match graph.node(index).neighbourhood_size() == 0 {
true => { graph.node(index).depth = 1; },
false => {
for &i in graph.node(index).neighbours.iter() {
// multiple mutable references
Graph::dfs(graph, i);
}
graph.node(index).depth =
1 + graph.node(index).
neighbours.iter().
map(|&x| graph.node(x).depth).
max().unwrap();
}
}
if graph.node(index).depth > graph.depth {
graph.depth = graph.node(index).depth;
}
}
}
fn main() {
let mut input_line = String::new();
io::stdin().read_line(&mut input_line);
let n = input_line.trim().parse::<usize>().unwrap();
// to avoid counting from 0 or excessive use of (-1)
let mut graph = Graph::new(); graph.add_node();
for _ in 0 .. n {
let mut input_line = String::new();
io::stdin().read_line(&mut input_line);
let separated = input_line.
split(" ").
collect::<Vec<_>>();
let u = separated[0].trim().parse::<usize>().unwrap();
let v = separated[1].trim().parse::<usize>().unwrap();
if graph.nodes_number() <= u { graph.add_node(); }
if graph.nodes_number() <= v { graph.add_node(); }
graph.node(u).add_neighbour(v);
}
let n = graph.nodes_number();
for i in 1 .. n {
if !graph.node(i).visited { Graph::dfs(&mut graph, i); }
}
println!("{}", graph.depth);
}

Instead of taking a copy of the vector before iterating over it, you could also iterate over the indices:
for ni in 0..graph.node(index).neighbours.len() {
let neighbour = graph.node(index).neighbours[ni];
Graph::dfs(graph, neighbour);
}
The neighbours vector is still borrowed during the iteration, but only briefly at each of the following points rather than for the whole loop:
graph.node(index).neighbours.len(): once at the beginning of the iteration for getting the length
let neighbour = graph.node(index).neighbours[ni];: in each iteration step for getting the neighbour at the current index
Like the copy approach, this solution is based on the constraint that the neighbours vector you are iterating over will not be changed by the call to dfs.
You can solve the remaining issues regarding multiple references in your code by providing immutable access to the graph nodes:
fn node_mut(&mut self, i: usize) -> &mut Node {
&mut self.nodes[i]
}
fn node(&self, i: usize) -> &Node {
&self.nodes[i]
}
Only make use of the mutable access via node_mut where necessary. For example when adding a neighbour: graph.node_mut(u).add_neighbour(v);
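Putting both suggestions together, a minimal sketch of the DFS could look like the following. The helpers `from_edges` and `longest_path` are hypothetical names introduced only for this illustration, and the early return on `visited` is an assumption about how the memoization is meant to work; the original `dfs` sets the flag but does not skip visited nodes.

```rust
struct Node {
    neighbours: Vec<usize>,
    depth: usize,
    visited: bool,
}

struct Graph {
    nodes: Vec<Node>,
    depth: usize,
}

impl Graph {
    // Hypothetical helper: build a graph with `n` nodes from directed edges.
    fn from_edges(n: usize, edges: &[(usize, usize)]) -> Graph {
        let mut nodes: Vec<Node> = (0..n)
            .map(|_| Node { neighbours: Vec::new(), depth: 0, visited: false })
            .collect();
        for &(u, v) in edges {
            nodes[u].neighbours.push(v);
        }
        Graph { nodes, depth: 0 }
    }

    // Shared access for reads, exclusive access only where a write happens.
    fn node(&self, i: usize) -> &Node {
        &self.nodes[i]
    }
    fn node_mut(&mut self, i: usize) -> &mut Node {
        &mut self.nodes[i]
    }

    fn dfs(&mut self, index: usize) {
        if self.node(index).visited {
            return; // depth already computed (memoized)
        }
        self.node_mut(index).visited = true;
        let mut deepest_child = 0;
        // Iterate over indices: each borrow of `neighbours` ends before the
        // recursive call needs `&mut self`.
        for ni in 0..self.node(index).neighbours.len() {
            let neighbour = self.node(index).neighbours[ni];
            self.dfs(neighbour);
            deepest_child = deepest_child.max(self.node(neighbour).depth);
        }
        let depth = 1 + deepest_child; // a leaf has depth 1, as in the question
        self.node_mut(index).depth = depth;
        self.depth = self.depth.max(depth);
    }
}

// Hypothetical driver: run the DFS from every node and report the maximum depth.
fn longest_path(n: usize, edges: &[(usize, usize)]) -> usize {
    let mut graph = Graph::from_edges(n, edges);
    for i in 0..n {
        graph.dfs(i);
    }
    graph.depth
}

fn main() {
    // 0 -> 1 -> 2 and 0 -> 3
    println!("{}", longest_path(4, &[(0, 1), (1, 2), (0, 3)]));
}
```

As with the other variants, this relies on `dfs` never changing any `neighbours` vector while it runs.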

You are modifying your graph structure while iterating through a vector contained within it. The compiler has no way of verifying that you do not add or remove from the vector during the iteration, which would invalidate the iterator. This is the intuitive reason for the error.
The easiest way to avoid this is to take a copy of the vector before iterating over it, so the compiler can see that the iterator does not change. This is a little suboptimal but resolves the error for now. Another lifetime error is solved in a similar way (but without much cost) by copying the depth into a variable before doing a comparison.
use std::io;
use std::env;
struct Node {
neighbours: Vec<usize>,
depth: usize,
visited: bool,
}
impl Node {
fn new() -> Node {
Node {
neighbours: Vec::new(),
depth: 0,
visited: false,
}
}
fn add_neighbour(&mut self, node: usize) {
self.neighbours.push(node);
}
fn neighbourhood_size(&self) -> usize {
self.neighbours.len()
}
}
struct Graph {
nodes: Vec<Node>,
depth: usize,
}
impl Graph {
fn new() -> Graph {
Graph {
nodes: Vec::new(),
depth: 0,
}
}
fn nodes_number(&self) -> usize {
self.nodes.len()
}
fn add_node(&mut self) {
self.nodes.push(Node::new());
}
fn node(&mut self, i: usize) -> &mut Node {
&mut self.nodes[i]
}
fn dfs(graph: &mut Graph, index: usize) {
if !graph.node(index).visited {
graph.node(index).visited = true;
}
match graph.node(index).neighbourhood_size() == 0 {
true => {
graph.node(index).depth = 1;
}
false => {
let neighbours = graph.node(index).neighbours.clone();
for &i in neighbours.iter() {
// no borrow of `graph` is held here: we iterate over the cloned vector
Graph::dfs(graph, i);
}
graph.node(index).depth = 1
+ neighbours
.iter()
.map(|&x| graph.node(x).depth)
.max()
.unwrap();
}
}
let depth = graph.node(index).depth;
if depth > graph.depth {
graph.depth = depth;
}
}
}
fn main() {
env::set_var("RUST_BACKTRACE", "1");
let mut input_line = String::new();
io::stdin().read_line(&mut input_line);
let n = input_line.trim().parse::<usize>().unwrap();
// to avoid counting from 0 or excessive use of (-1)
let mut graph = Graph::new();
graph.add_node();
for _ in 0..n {
let mut input_line = String::new();
io::stdin().read_line(&mut input_line);
let separated = input_line.split(" ").collect::<Vec<_>>();
let u = separated[0].trim().parse::<usize>().unwrap();
let v = separated[1].trim().parse::<usize>().unwrap();
if graph.nodes_number() <= u {
graph.add_node();
}
if graph.nodes_number() <= v {
graph.add_node();
}
graph.node(u).add_neighbour(v);
}
let n = graph.nodes_number();
for i in 1..n {
if !graph.node(i).visited {
Graph::dfs(&mut graph, i);
}
}
println!("{}", graph.depth);
}
If you were to modify your approach so that you did not mutate the structure during the search (i.e. you stored the visited data elsewhere), the code would work without this copy. This would also be more friendly to concurrent use.
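A rough sketch of that idea, assuming the graph is reduced to a plain adjacency list and the memoized depths live in a separate vector (the names `depth_of`, `longest_path` and `memo` are made up for this example):

```rust
// Adjacency list; the graph itself is never mutated during the search.
struct Graph {
    neighbours: Vec<Vec<usize>>,
}

// `memo[i]` is `Some(depth)` once node `i` has been processed.
fn depth_of(graph: &Graph, memo: &mut [Option<usize>], index: usize) -> usize {
    if let Some(d) = memo[index] {
        return d;
    }
    let mut deepest_child = 0;
    // `graph` is only borrowed immutably, so iterating directly is fine
    // while `memo` is mutated by the recursive calls.
    for &next in &graph.neighbours[index] {
        deepest_child = deepest_child.max(depth_of(graph, memo, next));
    }
    let d = 1 + deepest_child; // a leaf has depth 1, as in the question
    memo[index] = Some(d);
    d
}

fn longest_path(graph: &Graph) -> usize {
    let n = graph.neighbours.len();
    let mut memo = vec![None; n];
    (0..n)
        .map(|i| depth_of(graph, &mut memo, i))
        .max()
        .unwrap_or(0)
}

fn main() {
    // 0 -> 1 -> 2 and 0 -> 3
    let graph = Graph {
        neighbours: vec![vec![1, 3], vec![2], vec![], vec![]],
    };
    println!("{}", longest_path(&graph));
}
```

Because the search only needs `&Graph`, no clone of the neighbour list is required, and the borrow checker can see that the structure being iterated is never modified.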

Related

How to traverse and consume a vector in given order? [duplicate]

For example, I have a Vec<String> and an array storing indexes.
let src = vec!["a".to_string(), "b".to_string(), "c".to_string()];
let idx_arr = [2_usize, 0, 1];
The indexes stored in idx_arr comes from the range 0..src.len(), without repetition or omission.
I want to move the elements in src to another container in the given order, until the vector is completely consumed. For example,
let iter = into_iter_in_order(src, &idx_arr);
for s in iter {
// s: String
}
// or
consume_vec_in_order(src, &idx_arr, |s| {
// s: String
});
If the type of src can be changed to Vec<Option<String>>, things will be much easier, just use src[i].take(). However, it cannot.
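For comparison, the Option-based idea can be sketched in safe code by converting the vector up front. This is only a baseline under the assumption that an extra O(n) pass and a possible reallocation are acceptable; it is not the zero-cost solution being asked for.

```rust
// Safe baseline: wrap every element in `Option`, then `take()` them in the
// requested order. Panics if an index is out of range; an index that
// appears twice is silently skipped the second time.
fn consume_vec_in_order<T>(vec: Vec<T>, order: &[usize], mut cb: impl FnMut(T)) {
    let mut slots: Vec<Option<T>> = vec.into_iter().map(Some).collect();
    for &idx in order {
        if let Some(elem) = slots[idx].take() {
            cb(elem);
        }
    }
    // Any element whose index was missing from `order` is dropped here.
}

fn main() {
    let src = vec!["a".to_string(), "b".to_string(), "c".to_string()];
    let mut dst = Vec::new();
    consume_vec_in_order(src, &[2, 0, 1], |s| dst.push(s));
    println!("{:?}", dst);
}
```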
Edit:
"Another container" refers to any container, such as a queue or hash set. Reordering in place is not the answer to the problem. It introduces the extra time cost of O(n). The ideal method should be 0-cost.
I'm not sure whether this satisfies your requirements, but here is an algorithm that reorders the provided vector in place, without allocating a new temporary vector for the elements, which saves memory.
fn main() {
let src = &mut vec!["a".to_string(), "b".to_string(), "c".to_string(), "d".to_string()];
let idx_arr = [2_usize, 3, 1, 0];
consume_vector_in_order(src, idx_arr.to_vec());
println!("{:?}", src); // d , c , a , b
}
// In-place consume vector in order
fn consume_vector_in_order<T>(v: &mut Vec<T>, inds: Vec<usize>) -> &mut Vec<T>
where
T: Default,
{
let mut i: usize = 0;
let mut temp_inds = inds.to_vec();
while i < inds.len() {
let s_index = temp_inds[i];
if s_index != i {
let new_index = temp_inds[s_index];
temp_inds.swap(s_index, new_index);
v.swap(s_index, new_index);
} else {
i += 1;
}
}
v
}
You can use the technique found in How to sort a Vec by indices? (using my answer in particular), since that can reorder the data in place from the indices, and then it's just simple iteration:
fn consume_vec_in_order<T>(mut vec: Vec<T>, order: &[usize], mut cb: impl FnMut(T)) {
sort_by_indices(&mut vec, order.to_owned());
for elem in vec {
cb(elem);
}
}
Full example available on the playground.
Edit:
An ideal method, although it needs unstable features and functions that the standard library does not expose:
use std::alloc::{Allocator, RawVec};
use std::marker::PhantomData;
use std::mem::{self, ManuallyDrop};
use std::ptr::{self, NonNull};
#[inline]
unsafe fn into_iter_in_order<'a, T, A: Allocator>(
vec: Vec<T, A>,
order: &'a [usize],
) -> IntoIter<'a, T, A> {
unsafe {
let mut vec = ManuallyDrop::new(vec);
let cap = vec.capacity();
let alloc = ManuallyDrop::new(ptr::read(vec.allocator()));
let ptr = order.as_ptr();
let end = ptr.add(order.len());
IntoIter {
buf: NonNull::new_unchecked(vec.as_mut_ptr()),
_marker_1: PhantomData,
cap,
alloc,
ptr,
end,
_marker_2: PhantomData,
}
}
}
struct IntoIter<'a, T, A: Allocator> {
buf: NonNull<T>,
_marker_1: PhantomData<T>,
cap: usize,
alloc: ManuallyDrop<A>,
ptr: *const usize,
end: *const usize,
_marker_2: PhantomData<&'a usize>,
}
impl<T, A: Allocator> Iterator for IntoIter<T, A> {
type Item = T;
#[inline]
fn next(&mut self) -> Option<T> {
if self.ptr == self.end {
None
} else {
let idx = unsafe { *self.ptr };
self.ptr = unsafe { self.ptr.add(1) };
if T::IS_ZST {
Some(unsafe { mem::zeroed() })
} else {
Some(unsafe { ptr::read(self.buf.as_ptr().add(idx)) })
}
}
}
}
impl<#[may_dangle] T, A: Allocator> Drop for IntoIter<T, A> {
fn drop(&mut self) {
struct DropGuard<'a, T, A: Allocator>(&'a mut IntoIter<T, A>);
impl<T, A: Allocator> Drop for DropGuard<'_, T, A> {
fn drop(&mut self) {
unsafe {
// `IntoIter::alloc` is not used anymore after this and will be dropped by RawVec
let alloc = ManuallyDrop::take(&mut self.0.alloc);
// RawVec handles deallocation
let _ = RawVec::from_raw_parts_in(self.0.buf.as_ptr(), self.0.cap, alloc);
}
}
}
let guard = DropGuard(self);
// destroy the remaining elements
unsafe {
while self.ptr != self.end {
let idx = *self.ptr;
self.ptr = self.ptr.add(1);
let p = if T::IS_ZST {
self.buf.as_ptr().wrapping_byte_add(idx)
} else {
self.buf.as_ptr().add(idx)
};
ptr::drop_in_place(p);
}
}
// now `guard` will be dropped and do the rest
}
}
Example:
let src = vec![
"0".to_string(),
"1".to_string(),
"2".to_string(),
"3".to_string(),
"4".to_string(),
];
let mut dst = vec![];
let iter = unsafe { into_iter_in_order(src, &[2, 1, 3, 0, 4]) };
for s in iter {
dst.push(s);
}
assert_eq!(dst, vec!["2", "1", "3", "0", "4"]);
My previous answer:
use std::mem;
use std::ptr;
// `vec` must be `mut` because `set_len` below takes `&mut self`.
pub unsafe fn consume_vec_in_order<T>(mut vec: Vec<T>, order: &[usize], mut cb: impl FnMut(T)) {
// Check whether `order` contains all numbers in 0..len without repetition
// or omission.
if cfg!(debug_assertions) {
use std::collections::HashSet;
let n = order.len();
if n != vec.len() {
panic!("The length of `order` is not equal to that of `vec`.");
}
let mut set = HashSet::<usize>::new();
for &idx in order {
if idx >= n {
panic!("`{idx}` in the `order` is out of range (0..{n}).");
} else if set.contains(&idx) {
panic!("`order` contains the repeated element `{idx}`");
} else {
set.insert(idx);
}
}
}
unsafe {
for &idx in order {
let s = ptr::read(vec.get_unchecked(idx));
cb(s);
}
vec.set_len(0);
}
}
Example:
let src = vec![
"0".to_string(),
"1".to_string(),
"2".to_string(),
"3".to_string(),
"4".to_string(),
];
let mut dst = vec![];
// The function is `unsafe`, so the call has to be wrapped in an `unsafe` block.
unsafe {
consume_vec_in_order(
src,
&[2, 1, 3, 0, 4],
|elem| dst.push(elem),
);
}
assert_eq!(dst, vec!["2", "1", "3", "0", "4"]);

Rc::downgrade does not seem to drop ownership in a Rust program

I'm trying to create a simple minesweeper in Rust. For this I want to create a Grid object, holding all Case objects. When a Case is clicked, it notifies the Grid.
I want all cases to have a Weak reference to the grid. Why does the borrow checker prevent me from doing so?
Quoting the doc:
Weak is a version of Rc that holds a non-owning reference to the managed allocation.
The link on the line marked by the comment should not own the grid, yet the compiler tells me the value has been moved.
use std::rc::{Rc, Weak};
pub struct Grid {
dimensions: (usize, usize),
n_mines: usize,
content: Vec<Vec<Case>>,
}
impl Grid {
pub fn new(dimensions: (usize, usize), n_mines: usize) -> Self {
let (n, m) = dimensions;
let mines: Vec<Vec<Case>> = Vec::with_capacity(m);
let mut grid = Self {
dimensions: dimensions,
n_mines: n_mines,
content: mines,
};
let link = Rc::new(grid);
let link = Rc::downgrade(&link); // here link loses the ownership of grid, right ?
println!("{}\n", grid.n_mines);
for i in 0..m {
let mut line: Vec<Case> = Vec::with_capacity(n);
for j in 0..n {
let case = Case::new_with_parent((i, j), Weak::clone(&link));
line.push(case)
}
grid.content.push(line);
}
grid
}
pub fn click_neighbours(&mut self, coordinates: (usize, usize)) {
let surrouding = vec![(0, 0), (0, 1)]; // not a real implementation
for coord in surrouding {
self.content[coord.0][coord.1].click();
}
}
}
pub struct Case {
pub coordinates: (usize, usize),
parent: Weak<Grid>,
}
impl Case {
pub fn new_with_parent(coordinates: (usize, usize), parent: Weak<Grid>) -> Case {
Self {
coordinates: coordinates,
parent: parent,
}
}
pub fn click(&mut self) {
match self.parent.upgrade() {
Some(x) => x.click_neighbours(self.coordinates),
None => println!("whatever, not the issue here"),
}
}
}
fn main() {
let grid = Grid::new((10, 10), 5);
}
I tried multiple variations around enforcing the borrowing rules at runtime by putting the grid content into a RefCell. I'm also currently trying to adapt the Observer design pattern implementation in Rust to my problem, but without any success so far.
There is a misconception here. By moving the grid here:
let link = Rc::new(grid);
you irrevocably give up grid; nothing you do to link brings back your old grid.
The only thing that could do so would be to use Rc::try_unwrap to unwrap link and assign the result back to grid,
but that's not what you want to do here.
Instead you can additionally wrap the grid in a RefCell like so:
let grid = Rc::new(RefCell::new(grid));
which then lets you edit the grid inside with borrow_mut:
grid.borrow_mut().content.push(line);
You can see in PitaJ's answer what the full listing of your code should look like.
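To make the ownership flow concrete, here is a small sketch with a stripped-down, hypothetical `Grid` that has only one field. It is meant to illustrate the move into `Rc::new`, the non-owning `Weak`, and `Rc::try_unwrap`, not to reproduce the minesweeper code.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Simplified stand-in for the real Grid (assumption: one field is enough here).
struct Grid {
    n_mines: usize,
}

fn demo() -> (usize, bool) {
    let grid = Grid { n_mines: 5 };
    // `grid` is moved into the Rc; the old binding is unusable from here on.
    let strong: Rc<RefCell<Grid>> = Rc::new(RefCell::new(grid));
    // Downgrading borrows `strong`; it neither moves it nor gives the value back.
    let weak: Weak<RefCell<Grid>> = Rc::downgrade(&strong);
    assert_eq!(Rc::strong_count(&strong), 1);
    assert_eq!(Rc::weak_count(&strong), 1);

    // Mutation goes through the RefCell, whether reached via the Rc...
    strong.borrow_mut().n_mines += 1;
    // ...or via a temporarily upgraded Weak.
    if let Some(parent) = weak.upgrade() {
        parent.borrow_mut().n_mines += 1;
    }

    // With exactly one strong reference left, the value can be taken back out.
    let grid: Grid = Rc::try_unwrap(strong)
        .ok()
        .expect("exactly one strong reference")
        .into_inner();
    // The Weak now points at nothing.
    (grid.n_mines, weak.upgrade().is_none())
}

fn main() {
    let (n_mines, weak_is_dead) = demo();
    println!("{} {}", n_mines, weak_is_dead);
}
```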
grid is moved into Rc::new, so you can't use it any longer. You'll have to return that Rc from Grid::new.
You can't modify the value within the Rc while you hold Weak references to it, so you'll have to wrap the inner value in a RefCell as well. Otherwise the loop where you create the lines is impossible.
Here's a valid version of your code:
use std::cell::RefCell;
use std::rc::{Rc, Weak};
pub struct Grid {
dimensions: (usize, usize),
n_mines: usize,
content: Vec<Vec<Case>>,
}
impl Grid {
pub fn new(dimensions: (usize, usize), n_mines: usize) -> Rc<RefCell<Self>> {
let (n, m) = dimensions;
let mines: Vec<Vec<Case>> = Vec::with_capacity(m);
let grid = Rc::new(RefCell::new(Self {
dimensions: dimensions,
n_mines: n_mines,
content: mines,
}));
let link_weak = Rc::downgrade(&grid);
println!("{}\n", grid.borrow().n_mines);
{
// avoid repeatedly calling `borrow_mut` in the loop
let grid_mut = &mut *grid.borrow_mut();
for i in 0..m {
let mut line: Vec<Case> = Vec::with_capacity(n);
for j in 0..n {
let case = Case::new_with_parent((i, j), Weak::clone(&link_weak));
line.push(case)
}
grid_mut.content.push(line);
}
}
grid
}
pub fn click_neighbours(&mut self, coordinates: (usize, usize)) {
let surrouding = vec![(0, 0), (0, 1)]; // not a real implementation
for coord in surrouding {
self.content[coord.0][coord.1].click();
}
}
}
pub struct Case {
pub coordinates: (usize, usize),
parent: Weak<RefCell<Grid>>,
}
impl Case {
pub fn new_with_parent(coordinates: (usize, usize), parent: Weak<RefCell<Grid>>) -> Case {
Self {
coordinates: coordinates,
parent: parent,
}
}
pub fn click(&mut self) {
match self.parent.upgrade() {
Some(grid) => grid.borrow_mut().click_neighbours(self.coordinates),
None => println!("whatever, not the issue here"),
}
}
}
fn main() {
let grid = Grid::new((10, 10), 5);
}

How to drop a MaybeUninit of vector or array which is partially initialized?

I'm looking for information and good practices for using MaybeUninit
to directly initialize collections (typically arrays or vectors) and
drop them properly if initialization failed.
Thanks to the API examples, I was able to get by fairly quickly with
arrays but it was much trickier with vectors. On the example that
follows (which is a toy simplification of what I did in my project),
generic function, try_new<T: TryFrom<()>, A:ArrayUninit<T>>(len: usize), tries to create an array or a vector of objects T by means
of a fallible data generator TryFrom::try_from(_:()) implemented by
T. The order in which the array is generated is random
(asynchronism); this is simulated by function indices(len:usize).
Function, try_new<A:ArrayUninit>(len: usize), uses method
ArrayUninit::try_uninit(len: usize), implemented by Vec<Data> and
[Data;N], for building uninitialized array or vector.
In our main, we use data type, Data, as example, for which
generator, TryFrom<()> is implemented.
The following code seems to work, but I'm wondering
how to drop uninitialized data:
(playground)
use core::{ time::Duration, mem::MaybeUninit, };
use std::thread;
use rand::prelude::*;
// trait with method for building uninited array/vector
// implementations for Vec<T> and [T;N] after the main()
trait ArrayUninit<T>: AsMut<[T]> + Sized {
fn try_uninit(len: usize) -> Result<MaybeUninit<Self>,String>;
}
// generate shuffled indices
fn indices(len: usize) -> Box<dyn Iterator<Item = usize>> {
let mut vec: Vec<usize> = (0..len).collect();
vec.shuffle(&mut thread_rng());
Box::new(vec.into_iter())
}
// try to build an array or a vector of objects T
fn try_new<T: TryFrom<()>, A:ArrayUninit<T>>(len: usize) -> Result<A,String> {
// build uninitialized collection
let mut uninited = A::try_uninit(len)?;
// simulate initialization in random order
let indices = indices(len);
// build a mutable ref to the array/vector
let ra: &mut A = unsafe {(uninited.as_mut_ptr() as *mut A).as_mut() }.unwrap();
let mut failed = false;
for i in indices {
// get ptr at i
let ptr_arr: * mut T = unsafe{AsMut::<[T]>::as_mut(ra).as_mut_ptr().add(i)};
// get object and break if failed
let data = match T::try_from(()) {
Ok(data) => data, Err(_) => { failed = true; break; },
};
// set object
unsafe { *ptr_arr = data };
}
if !failed {
Ok(unsafe{ uninited.assume_init() }) // return array, if successful
} else {
// if failed, then
for i in 0..len { // drop all objects within array/vector
let ptr_arr: * mut T = unsafe{AsMut::<[T]>::as_mut(ra).as_mut_ptr().add(i)};
drop(unsafe { ptr_arr.read() });
}
drop(uninited); // and drop uninited array/vector
Err(format!("failed to init"))
}
}
// Object Data
#[derive(Debug)]
struct Data(f64);
impl TryFrom<()> for Data {
type Error = ();
// generate a float with errors; time consuming
fn try_from(_:()) -> Result<Self,()> {
thread::sleep(Duration::from_millis(10));
let f = rand::random();
if f <= 0.99 { Ok(Data(f)) } else { Err(()) }
}
}
fn main() {
let result: Result<Vec<Data>,_> = try_new(3);
println!("result: {:?}",result);
let result: Result<[Data;3],_> = try_new(3);
println!("result: {:?}",result);
let result: Result<Vec<Data>,_> = try_new(1000);
println!("result: {:?}",result);
let result: Result<[Data;1000],_> = try_new(1000);
println!("result: {:?}",result);
}
impl<T> ArrayUninit<T> for Vec<T> {
fn try_uninit(len: usize) -> Result<MaybeUninit<Self>,String> {
let mut v: MaybeUninit<Vec<T>> = MaybeUninit::uninit();
let mut vv = Vec::with_capacity(len);
unsafe { vv.set_len(len) };
v.write(vv);
Ok(v)
}
}
impl<T,const N: usize> ArrayUninit<T> for [T;N] {
fn try_uninit(len: usize) -> Result<MaybeUninit<Self>,String> {
if len == N {
Ok(MaybeUninit::uninit())
} else { Err(format!("len differs from array size")) }
}
}
Here is an example of run (results are random):
Standard Error
Compiling playground v0.0.1 (/playground)
Finished dev [unoptimized + debuginfo] target(s) in 0.84s
Running `target/debug/playground`
Standard Output
result: Ok([Data(0.9778296353515407), Data(0.9319034033060891), Data(0.11046580243682291)])
result: Ok([Data(0.749182522350767), Data(0.5432451150541627), Data(0.6840763419767837)])
result: Err("failed to init")
result: Err("failed to init")
For now, in case of failure, I drop all the addresses within the
array/vector, both initialized and uninitialized, then I drop the
array/vector. It seems to work, but I'm surprised that one can also
drop uninitialized data.
Can anyone confirm if this is a right approach to drop the
uninitialized data? If not, what are the rules to follow?
[EDIT:]
Thanks to the remarks of isaactfa and Chayim, I updated the code as follows (playground):
use core::{ time::Duration, mem::MaybeUninit, };
use std::thread;
use rand::prelude::*;
// trait with method for building uninited array/vector
// implementations for Vec<T> and [T;N] after the main()
trait ArrayUninit<T>: AsMut<[T]> + Sized {
type Uninited: Sized;
fn try_uninit(len: usize) -> Result<Self::Uninited,String>;
unsafe fn set(uninit: &mut Self::Uninited, i: usize, t: T);
unsafe fn destructor(uninit: &mut Self::Uninited,);
unsafe fn finalize(uninit: Self::Uninited) -> Self;
}
// generate shuffled indices
fn indices(len: usize) -> Box<dyn Iterator<Item = usize>> {
let mut vec: Vec<usize> = (0..len).collect();
vec.shuffle(&mut thread_rng());
Box::new(vec.into_iter())
}
// try to build an array or a vector of objects T
fn try_new<T: TryFrom<()>, A:ArrayUninit<T>>(len: usize) -> Result<A,String> {
// build uninitialized collection
let mut uninited = A::try_uninit(len)?;
// simulate initialization in random order
let indices = indices(len);
let mut failed = false;
for i in indices {
// get object and break if failed
let data = match T::try_from(()) {
Ok(data) => { data }, Err(_) => { failed = true; break; },
};
// set object
unsafe { A::set(&mut uninited,i,data) };
}
if !failed {
Ok(unsafe{ A::finalize(uninited) }) // return array, if successful
} else {
unsafe { A::destructor(&mut uninited) };
Err(format!("failed to init"))
}
}
// Object Data
#[derive(Debug)]
struct Data(String);
impl TryFrom<()> for Data {
type Error = ();
// generate a value from a random float, failing at random; time consuming
fn try_from(_:()) -> Result<Self,()> {
thread::sleep(Duration::from_millis(10));
let f:f32 = rand::random();
if f <= 0.99 { Ok(Data(format!("Value = {}",f))) } else { Err(()) }
}
}
fn main() {
let result: Result<Vec<Data>,_> = try_new(3);
println!("result: {:?}",result);
let result: Result<[Data;3],_> = try_new(3);
println!("result: {:?}",result);
let result: Result<Vec<Data>,_> = try_new(3);
println!("result: {:?}",result);
let result: Result<[Data;3],_> = try_new(3);
println!("result: {:?}",result);
let result: Result<Vec<Data>,_> = try_new(1000);
println!("result: {:?}",result);
let result: Result<[Data;1000],_> = try_new(1000);
println!("result: {:?}",result);
let result: Result<Vec<Data>,_> = try_new(1000);
println!("result: {:?}",result);
let result: Result<[Data;1000],_> = try_new(1000);
println!("result: {:?}",result);
}
impl<T> ArrayUninit<T> for Vec<T> {
type Uninited = (Vec<T>,Vec<bool>);
fn try_uninit(len: usize) -> Result<Self::Uninited,String> {
Ok((Vec::with_capacity(len),vec![false;len]))
}
unsafe fn set((uninit,flag): &mut Self::Uninited, i: usize, t: T) {
uninit.as_mut_ptr().offset(i as isize).write(t); flag[i] = true;
}
unsafe fn destructor((uninit,flag): &mut Self::Uninited,) {
for i in 0..flag.len() {
if flag[i] { std::ptr::drop_in_place(uninit.as_mut_ptr().offset(i as isize)); }
}
}
unsafe fn finalize((mut uninit,flag): Self::Uninited) -> Self {
uninit.set_len(flag.len());
uninit
}
}
impl<T,const N: usize> ArrayUninit<T> for [T;N] {
type Uninited = ([MaybeUninit<T>;N],[bool;N]);
fn try_uninit(len: usize) -> Result<Self::Uninited,String> {
if len == N {
let uninit = unsafe{ MaybeUninit::uninit().assume_init() };
Ok((uninit,[false;N]))
} else { Err(format!("len differs from array size")) }
}
unsafe fn set((uninit,flag): &mut Self::Uninited, i: usize, t: T) {
uninit[i].write(t); flag[i] = true;
}
unsafe fn destructor((uninit,flag): &mut Self::Uninited,) {
for i in 0..N {
if flag[i] { std::ptr::drop_in_place(uninit[i].as_mut_ptr()); }
}
}
unsafe fn finalize((uninit,_): Self::Uninited) -> Self {
(&uninit as *const _ as *const Self).read()
}
}
The idea here is to use specific approaches for arrays and vecs, which are encoded within trait ArrayUninit. MaybeUninit is used only for arrays, while it is not needed for vecs.
Your code contains multiple points of UB:
Calling set_len() when the elements in range are uninitialized (you're doing that in try_uninit() for Vec<T>) is UB (see set_len()'s docs).
When initializing arrays, you create uninitialized storage for the array in try_uninit() and then turn that into a reference to an initialized array in try_new(). This may be undefined behavior (but not necessarily), see https://github.com/rust-lang/unsafe-code-guidelines/issues/84.
When setting the value at the index (unsafe { *ptr_arr = data } in try_new()), you drop the old value. If the value has no drop glue this is likely fine, but if it does, this is undefined behavior since you drop uninitialized data. You need to use std::ptr::write() instead.
You're doing a typed copy of the values by drop(unsafe { ptr_arr.read() }). Doing a typed copy of uninitialized values is definitely UB (Miri is even flagging this one).
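One way to avoid all four problems for the vector case is to keep every slot as a `MaybeUninit<T>` until the very end and to track which slots have been written. The following is only a sketch under those assumptions; `try_build`, `order`, `make` and `filled` are hypothetical names, and the fallible generator is passed in as a closure rather than through `TryFrom<()>`.

```rust
use std::mem::MaybeUninit;

// Fill `len` slots in the given `order`, calling `make` for each index.
// On failure, only the slots that were actually written are dropped.
fn try_build<T, E>(
    len: usize,
    order: &[usize],
    mut make: impl FnMut(usize) -> Result<T, E>,
) -> Result<Vec<T>, E> {
    // A Vec of MaybeUninit<T> may legitimately have length `len` while its
    // elements are uninitialized, so no `set_len` on a Vec<T> is needed.
    let mut slots: Vec<MaybeUninit<T>> = Vec::with_capacity(len);
    slots.resize_with(len, MaybeUninit::uninit);
    let mut filled = vec![false; len];

    for &i in order {
        match make(i) {
            Ok(value) => {
                if filled[i] {
                    // SAFETY: the flag says this slot holds a valid T.
                    unsafe { slots[i].assume_init_drop() };
                }
                // `write` never drops the previous (possibly uninitialized) contents.
                slots[i].write(value);
                filled[i] = true;
            }
            Err(e) => {
                for (slot, &init) in slots.iter_mut().zip(&filled) {
                    if init {
                        // SAFETY: only initialized slots are dropped, in place.
                        unsafe { slot.assume_init_drop() };
                    }
                }
                return Err(e);
            }
        }
    }

    // If this fails, the written elements are leaked, which is safe.
    assert!(filled.iter().all(|&f| f), "`order` must cover every index");
    // SAFETY: every slot was written exactly as checked above.
    Ok(slots
        .into_iter()
        .map(|slot| unsafe { slot.assume_init() })
        .collect())
}

fn main() {
    let ok: Result<Vec<String>, ()> = try_build(3, &[2, 0, 1], |i| Ok(format!("v{i}")));
    println!("{:?}", ok);
}
```

The same bookkeeping applies to arrays with `[MaybeUninit<T>; N]` and a `[bool; N]`, as in the updated code in the question.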

Lockless processing of non overlapping non contiguous indexes by multiple threads in Rust

I am practicing rust and decided to create a Matrix ops/factorization project.
Basically I want to be able to process the underlying vector in multiple threads. Since I will be providing each thread non-overlapping indexes (which may or may not be contiguous) and the threads will be joined before the end of whatever function created them, there is no need for a lock /synchronization.
I know that there are several crates that can do this, but I would like to know if there is a relatively idiomatic crate-free way to implement it on my own.
The best I could come up with is (simplified the code a bit):
use std::thread;
//This represents the Matrix
#[derive(Debug, Clone)]
pub struct MainStruct {
pub data: Vec<f64>,
}
//This is the bit that will be shared by the threads,
//ideally it should have its lifetime tied to that of MainStruct
//but i have no idea how to make phantomdata work in this case
#[derive(Debug, Clone)]
pub struct SliceTest {
pub data: Vec<SubSlice>,
}
//This struct is to hide *mut f64 to allow it to be shared to other threads
#[derive(Debug, Clone)]
pub struct SubSlice {
pub data: *mut f64,
}
impl MainStruct {
pub fn slice(&mut self) -> (SliceTest, SliceTest) {
let mut out_vec_odd: Vec<SubSlice> = Vec::new();
let mut out_vec_even: Vec<SubSlice> = Vec::new();
unsafe {
let ptr = self.data.as_mut_ptr();
for i in 0..self.data.len() {
let ptr_to_push = ptr.add(i);
//Non contiguous idxs
if i % 2 == 0 {
out_vec_even.push(SubSlice{data:ptr_to_push});
} else {
out_vec_odd.push(SubSlice{data:ptr_to_push});
}
}
}
(SliceTest{data: out_vec_even}, SliceTest{data: out_vec_odd})
}
}
impl SubSlice {
pub fn set(&self, val: f64) {
unsafe {*(self.data) = val;}
}
}
unsafe impl Send for SliceTest {}
unsafe impl Send for SubSlice {}
fn main() {
let mut maindata = MainStruct {
data: vec![0.0, 1.0, 2.0, 3.0, 4.0, 5.0],
};
let (mut outvec1, mut outvec2) = maindata.slice();
let mut threads = Vec::new();
threads.push(
thread::spawn(move || {
for i in 0..outvec1.data.len() {
outvec1.data[i].set(999.9);
}
})
);
threads.push(
thread::spawn(move || {
for i in 0..outvec2.data.len() {
outvec2.data[i].set(999.9);
}
})
);
for handles in threads {
handles.join();
}
println!("maindata = {:?}", maindata.data);
}
EDIT:
Following kmdreko's suggestion below, I got the code to work exactly how I wanted it without using unsafe code, yay!
Of course, in terms of performance it may be cheaper to copy the f64 slices than to create vectors of mutable references, unless your struct is filled with other structs instead of f64.
extern crate crossbeam;
use crossbeam::thread;
#[derive(Debug, Clone)]
pub struct Matrix {
data: Vec<f64>,
m: usize, //number of rows
n: usize, //number of cols
}
...
impl Matrix {
...
pub fn get_data_mut(&mut self) -> &mut Vec<f64> {
&mut self.data
}
pub fn calculate_idx(max_cols: usize, i: usize, j: usize) -> usize {
let actual_idx = j + max_cols * i;
actual_idx
}
//Get individual mutable references for contiguous indexes (rows)
pub fn get_all_row_slices(&mut self) -> Vec<Vec<&mut f64>> {
let max_cols = self.max_cols();
let max_rows = self.max_rows();
let inner_data = self.get_data_mut().chunks_mut(max_cols);
let mut out_vec: Vec<Vec<&mut f64>> = Vec::with_capacity(max_rows);
for chunk in inner_data {
let row_vec = chunk.iter_mut().collect();
out_vec.push(row_vec);
}
out_vec
}
//Get mutable references for disjoint indexes (columns)
pub fn get_all_col_slices(&mut self) -> Vec<Vec<&mut f64>> {
let max_cols = self.max_cols();
let max_rows = self.max_rows();
let inner_data = self.get_data_mut().chunks_mut(max_cols);
let mut out_vec: Vec<Vec<&mut f64>> = Vec::with_capacity(max_cols);
for _ in 0..max_cols {
out_vec.push(Vec::with_capacity(max_rows));
}
let mut inner_idx = 0;
for chunk in inner_data {
let row_vec_it = chunk.iter_mut();
for elem in row_vec_it {
out_vec[inner_idx].push(elem);
inner_idx += 1;
}
inner_idx = 0;
}
out_vec
}
...
}
fn test_multithreading() {
fn test(in_vec: Vec<&mut f64>) {
for elem in in_vec {
*elem = 33.3;
}
}
fn launch_task(mat: &mut Matrix, f: fn(Vec<&mut f64>)) {
let test_vec = mat.get_all_row_slices();
thread::scope(|s| {
for elem in test_vec.into_iter() {
s.spawn(move |_| {
println!("Spawning thread...");
f(elem);
});
}
}).unwrap();
}
let rows = 4;
let cols = 3;
//new function code omitted, returns Result<Self, MatrixError>
let mut mat = Matrix::new(rows, cols).unwrap();
launch_task(&mut mat, test);
for i in 0..rows {
for j in 0..cols {
//Requires index trait implemented for matrix
assert_eq!(mat[(i, j)], 33.3);
}
}
}
This API is unsound. Since there is no lifetime annotation binding SliceTest and SubSlice to the MainStruct, they can be preserved after the data has been destroyed and if used would result in use-after-free errors.
It's easy to make it safe though; you can use .iter_mut() to get distinct mutable references to your elements:
pub fn slice(&mut self) -> (Vec<&mut f64>, Vec<&mut f64>) {
let mut out_vec_even = vec![];
let mut out_vec_odd = vec![];
for (i, item_ref) in self.data.iter_mut().enumerate() {
if i % 2 == 0 {
out_vec_even.push(item_ref);
} else {
out_vec_odd.push(item_ref);
}
}
(out_vec_even, out_vec_odd)
}
However, this surfaces another problem: thread::spawn cannot hold references to local variables. The threads created are allowed to live beyond the scope they're created in, so even though you did .join() them, you aren't required to. This was a potential issue in your original code as well, just the compiler couldn't warn about it.
There's no easy way to solve this. You'd need to use a non-referential way to use data on the other threads, but that would be using Arc, which doesn't allow mutating its data, so you'd have to resort to a Mutex, which is what you've tried to avoid.
I would suggest reaching for scope from the crossbeam crate, which does allow you to spawn threads that reference local data. I know you've wanted to avoid using crates, but this is the best solution in my opinion.
See a working version on the playground.
See:
How to get multiple mutable references to elements in a Vec?
Can you specify a non-static lifetime for threads?
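Since Rust 1.63 the standard library also provides std::thread::scope, which covers the crate-free requirement. A minimal sketch of the even/odd split using it follows; the function names `split_even_odd` and `fill_in_parallel` and the fill values are assumptions for illustration.

```rust
use std::thread;

// Hand out disjoint mutable references: even indices in one Vec, odd in the other.
fn split_even_odd(data: &mut [f64]) -> (Vec<&mut f64>, Vec<&mut f64>) {
    let mut even = Vec::new();
    let mut odd = Vec::new();
    for (i, item) in data.iter_mut().enumerate() {
        if i % 2 == 0 {
            even.push(item);
        } else {
            odd.push(item);
        }
    }
    (even, odd)
}

fn fill_in_parallel(data: &mut [f64], even_val: f64, odd_val: f64) {
    let (even, odd) = split_even_odd(data);
    // Scoped threads are joined before `scope` returns, so they may borrow
    // from `data` without locks or `'static` bounds.
    thread::scope(|s| {
        s.spawn(move || {
            for x in even {
                *x = even_val;
            }
        });
        s.spawn(move || {
            for x in odd {
                *x = odd_val;
            }
        });
    });
}

fn main() {
    let mut data = vec![0.0, 1.0, 2.0, 3.0, 4.0, 5.0];
    fill_in_parallel(&mut data, 999.9, -1.0);
    println!("{:?}", data);
}
```

The borrow checker guarantees that the two vectors of references are disjoint, so no `unsafe impl Send` is needed.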

How to define mutual recursion with closures?

I can do something like this:
fn func() -> (Vec<i32>, Vec<i32>) {
let mut u = vec![0;5];
let mut v = vec![0;5];
fn foo(u: &mut [i32], v: &mut [i32], i: usize, j: usize) {
for k in i+1..u.len() {
u[k] += 1;
bar(u, v, k, j);
}
}
fn bar(u: &mut [i32], v: &mut [i32], i: usize, j: usize) {
for k in j+1..v.len() {
v[k] += 1;
foo(u, v, i, k);
}
}
foo(&mut u, &mut v, 0, 0);
(u,v)
}
fn main() {
let (u,v) = func();
println!("{:?}", u);
println!("{:?}", v);
}
but I would prefer to do something like this:
fn func() -> (Vec<i32>, Vec<i32>) {
let mut u = vec![0;5];
let mut v = vec![0;5];
let foo = |i, j| {
for k in i+1..u.len() {
u[k] += 1;
bar(k, j);
}
};
let bar = |i, j| {
for k in j+1..v.len() {
v[k] += 1;
foo(i, k);
}
};
foo(0, 0);
(u,v)
}
fn main() {
let (u,v) = func();
println!("{:?}", u);
println!("{:?}", v);
}
The second example doesn't compile with the error: unresolved name bar.
In my task I can do it through one recursion, but it will not look clear.
Does anyone have any other suggestions?
I have a solution for mutually recursive closures, but it doesn't work with multiple mutable borrows, so I couldn't extend it to your example.
There is a way to define mutually recursive closures, using an approach similar to how this answer does single recursion. You can put the closures together into a struct, where each of them takes a borrow of that struct as an extra argument.
fn func(n: u32) -> bool {
struct EvenOdd<'a> {
// `dyn` is required for trait objects in current Rust editions
even: &'a dyn Fn(u32, &EvenOdd<'a>) -> bool,
odd: &'a dyn Fn(u32, &EvenOdd<'a>) -> bool
}
let evenodd = EvenOdd {
even: &|n, evenodd| {
if n == 0 {
true
} else {
(evenodd.odd)(n - 1, evenodd)
}
},
odd: &|n, evenodd| {
if n == 0 {
false
} else {
(evenodd.even)(n - 1, evenodd)
}
}
};
(evenodd.even)(n, &evenodd)
}
fn main() {
println!("{}", func(5));
println!("{}", func(6));
}
While defining mutually recursive closures works in some cases, as demonstrated in the answer by Alex Knauth, I don't think that's an approach you should usually take. It is kind of opaque, has some limitations pointed out in the other answer, and it also has a performance overhead since it uses trait objects and dynamic dispatch at runtime.
Closures in Rust can be thought of as functions with associated structs storing the data you closed over. So a more general solution is to define your own struct storing the data you want to close over, and define methods on that struct instead of closures. For this case, the code could look like this:
pub struct FooBar {
pub u: Vec<i32>,
pub v: Vec<i32>,
}
impl FooBar {
fn new(u: Vec<i32>, v: Vec<i32>) -> Self {
Self { u, v }
}
fn foo(&mut self, i: usize, j: usize) {
for k in i+1..self.u.len() {
self.u[k] += 1;
self.bar(k, j);
}
}
fn bar(&mut self, i: usize, j: usize) {
for k in j+1..self.v.len() {
self.v[k] += 1;
self.foo(i, k);
}
}
}
fn main() {
let mut x = FooBar::new(vec![0;5], vec![0;5]);
x.foo(0, 0);
println!("{:?}", x.u);
println!("{:?}", x.v);
}
(Playground)
While this can get slightly more verbose than closures, and requires a few more explicit type annotations, it's more flexible and easier to read, so I would generally prefer this approach.
