into_shape after remove_index fails

I have an &[f64] with 309,760 elements. This dataset is an array of 7040 property sets. Each property set contains a pair of f64s followed by 14 triplets of f64s (2 + 14 * 3 = 44 values per set).
I am only interested in the triplets.
I can read this dataset into an ndarray like this:
// Copy the values out of the slice so the array holds f64, not &f64.
let array = Array::from_iter(data.iter().copied());
let mut propertysets = array.into_shape(IxDyn(&[7040, 44])).unwrap();
and I can remove the first two f64 of each property set like this:
propertysets.remove_index(Axis(1), 0);
propertysets.remove_index(Axis(1), 0);
println!("{:?}", propertysets.shape()); // [7040, 42]
which looks promising. But now I want to reshape the array into [7040, 14, 3], which should work because 3 * 14 = 42, but:
let result = propertysets.into_shape(IxDyn(&[7040, 14, 3])).unwrap();
panics with this message:
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: ShapeError/IncompatibleLayout: incompatible memory layout'
The documentation of remove_index says:
the elements are not deinitialized or dropped by this, just moved out of view
which probably explains why this fails. But how to do it right? Do I have to copy propertysets somehow into a new ndarray of the correct shape? But how?
Using Array::from_iter(propertysets.iter()) results in an ndarray of &f64 instead of f64.

For the into_shape operation to work, arrays have to be C-contiguous or Fortran-contiguous in memory (see docs). After
array.into_shape(IxDyn(&[7040, 44])).unwrap();
they are contiguous. But after
propertysets.remove_index(Axis(1), 0);
they are not. Why? remove_index does not move the whole array. The elements are just moved out of view (see docs), so each row still occupies 44 slots in memory while only 42 of them are visible.
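One way to sidestep the layout problem is to drop the unwanted values before the data ever reaches ndarray. The following is only a minimal sketch using the standard library: the helper name drop_leading_columns is hypothetical, and it assumes the input length is a multiple of the row length (chunks_exact ignores a trailing partial row). The resulting Vec is contiguous, so it could then be handed to Array::from_shape_vec with the final shape.

```rust
/// Copies `data` into a new contiguous buffer, skipping the first `skip`
/// values of every `row_len`-long row. (Hypothetical helper for illustration.)
fn drop_leading_columns(data: &[f64], row_len: usize, skip: usize) -> Vec<f64> {
    data.chunks_exact(row_len)
        // Keep only the tail of each row; the values are copied, not borrowed.
        .flat_map(|row| row[skip..].iter().copied())
        .collect()
}

fn main() {
    // Two toy "property sets" of 5 values each; the first 2 of each are unwanted.
    let data = [0.0, 0.1, 1.0, 2.0, 3.0, 0.2, 0.3, 4.0, 5.0, 6.0];
    let compact = drop_leading_columns(&data, 5, 2);
    println!("{:?}", compact); // [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
}
```

For the dataset in the question the call would be drop_leading_columns(data, 44, 2), which yields 7040 * 42 values.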
How to solve this?
reassemble using from_shape_vec
use the new to_shape method (not in the docs yet at the time of writing)
Example:
use ndarray::{Array, ArrayBase, Axis, IxDyn, OwnedRepr};

fn into_shape_reassemble(data: Vec<f64>) -> Array<f64, IxDyn> {
    let array = Array::from_iter(data);
    let mut result = array.into_shape(IxDyn(&[7040, 44])).unwrap();
    result.remove_index(Axis(1), 0);
    result.remove_index(Axis(1), 0);
    // Copy the visible elements into a fresh, contiguous array.
    let result = Array::from_shape_vec((7040, 42), result.iter().cloned().collect()).unwrap();
    let result = result.into_shape(IxDyn(&[7040, 14, 3])).unwrap();
    println!("{:?}", result.shape());
    result
}

fn to_shape(data: Vec<f64>) -> ArrayBase<OwnedRepr<f64>, IxDyn> {
    let array = Array::from_iter(data);
    let mut result = array.into_shape(IxDyn(&[7040, 44])).unwrap();
    result.remove_index(Axis(1), 0);
    result.remove_index(Axis(1), 0);
    // to_shape copies the data when the layout is not contiguous.
    let result = result.to_shape((7040, 14, 3)).unwrap().to_owned();
    println!("{:?}", result.shape());
    result.into_dyn()
}

#[cfg(test)]
mod tests {
    #[test]
    fn test_into_shape() {
        let data = vec![0.; 7040 * 44];
        super::into_shape_reassemble(data);
    }

    #[test]
    fn test_to_shape() {
        let data = vec![0.; 7040 * 44];
        super::to_shape(data);
    }
}
Output:
[7040, 14, 3]
[7040, 14, 3]

Related

How to let several threads write to the same variable without mutex in Rust?

I am trying to implement an outer function that could calculate the outer product of two 1D arrays. Something like this:
use std::thread;
use ndarray::prelude::*;

pub fn multithread_outer(A: &Array1<f64>, B: &Array1<f64>) -> Array2<f64> {
    let mut result = Array2::<f64>::default((A.len(), B.len()));
    let thread_num = 5;
    let n = A.len() / thread_num;
    // a & b are ArcArray2<f64>
    let a = A.to_owned().into_shared();
    let b = B.to_owned().into_shared();
    for i in 0..thread_num {
        let a = a.clone();
        let b = b.clone();
        thread::spawn(move || {
            for j in i * n..(i + 1) * n {
                for k in 0..b.len() {
                    // This is the line I want to change
                    result[[j, k]] = a[j] * b[k];
                }
            }
        });
    }
    // Use join to make sure all threads finish here
    // Not so related to this question, so I didn't put it here
    result
}
You can see that, by design, two threads will never write to the same element. However, the Rust compiler will not allow two mutable references to the same result variable, and using a mutex would make this much slower. What is the right way to implement this function?

While it is possible to do manually (with thread::scope and split_at_mut, for example), ndarray already has parallel iteration integrated into its library, based on rayon:
https://docs.rs/ndarray/latest/ndarray/parallel
Here is what your code would look like with parallel iterators:
use ndarray::parallel::prelude::*;
use ndarray::prelude::*;

pub fn multithread_outer(a: &Array1<f64>, b: &Array1<f64>) -> Array2<f64> {
    let mut result = Array2::<f64>::default((a.len(), b.len()));
    result
        .axis_iter_mut(Axis(0))
        .into_par_iter()
        .enumerate()
        .for_each(|(row_id, mut row)| {
            for (col_id, cell) in row.iter_mut().enumerate() {
                *cell = a[row_id] * b[col_id];
            }
        });
    result
}

fn main() {
    let a = Array1::from_vec(vec![1., 2., 3.]);
    let b = Array1::from_vec(vec![4., 5., 6., 7.]);
    let c = multithread_outer(&a, &b);
    println!("{}", c)
}
[[4, 5, 6, 7],
[8, 10, 12, 14],
[12, 15, 18, 21]]
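For comparison, the manual route mentioned at the start of this answer could be sketched roughly as follows, using only the standard library. This is an illustration under assumptions, not the answer's own code: the result is stored row-major in a flat Vec instead of an Array2, the function name outer_scoped is hypothetical, and chunks_mut is used in place of repeated split_at_mut calls to hand each thread a disjoint block of rows.

```rust
use std::thread;

/// Outer product stored row-major in a flat Vec:
/// result[i * b.len() + j] == a[i] * b[j]. (Hypothetical helper.)
fn outer_scoped(a: &[f64], b: &[f64], thread_num: usize) -> Vec<f64> {
    let cols = b.len();
    let mut result = vec![0.0; a.len() * cols];
    if result.is_empty() {
        return result; // nothing to compute, and chunks_mut(0) would panic
    }
    // Rows handled by each thread, rounded up so that every row is covered.
    let threads = thread_num.max(1);
    let rows_per_thread = (a.len() + threads - 1) / threads;

    thread::scope(|s| {
        // Each chunk is a disjoint &mut slice, so no mutex is needed.
        for (chunk_id, chunk) in result.chunks_mut(rows_per_thread * cols).enumerate() {
            s.spawn(move || {
                let first_row = chunk_id * rows_per_thread;
                for (local_row, row) in chunk.chunks_mut(cols).enumerate() {
                    let x = a[first_row + local_row];
                    for (cell, y) in row.iter_mut().zip(b) {
                        *cell = x * y;
                    }
                }
            });
        }
    }); // all spawned threads are joined when the scope ends

    result
}

fn main() {
    let a = [1.0, 2.0, 3.0];
    let b = [4.0, 5.0, 6.0, 7.0];
    println!("{:?}", outer_scoped(&a, &b, 2));
}
```

Because the scope joins its threads before returning, the borrowed inputs and the output buffer can be used without Arc or cloning.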

How to obtain the chunk index in Rayon's par_chunks_mut

I have some data and I want to process it and use it to fill an array that already exists. For example suppose I want to repeat each value 4 times (playground):
use rayon::prelude::*; // 1.3.0

fn main() {
    let input = vec![4, 7, 2, 3, 5, 8];
    // This already exists.
    let mut output = vec![0; input.len() * 4];
    output.par_chunks_mut(4).for_each(|slice| {
        for x in slice.iter_mut() {
            *x = input[?];
        }
    });
}
This almost works but Rayon doesn't pass the chunk index to me so I can't put anything in input[?]. Is there an efficient solution?

The easiest thing to do is avoid the need for an index at all. For this example, we can just zip the iterators:
use rayon::prelude::*; // 1.3.0

fn main() {
    let input = vec![4, 7, 2, 3, 5, 8];
    let mut output = vec![0; input.len() * 4];
    // Can also use `.zip(&input)` if you don't want to give up ownership
    output.par_chunks_mut(4).zip(input).for_each(|(o, i)| {
        for o in o {
            *o = i
        }
    });
    println!("{:?}", output)
}
For traditional iterators, this style of implementation is beneficial as it avoids unneeded bounds checks which would otherwise be handled by the iterator. I'm not sure that Rayon benefits from the exact same thing, but I also don't see any reason it wouldn't.

Rayon provides an enumerate() function for most of its iterators that works just like the non-parallel counterpart:
let input = vec![4, 7, 2, 3, 5, 8];
let mut output = vec![0; input.len() * 4];
output.par_chunks_mut(4).enumerate().for_each(|(i, slice)| {
    for x in slice.iter_mut() {
        *x = input[i];
    }
});
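The two approaches above (zipping the chunks with the input versus enumerating the chunks) can be compared side by side with the sequential chunks_mut from the standard library, which offers the same zip and enumerate adaptors. This is only a sketch: the function names repeat_zip and repeat_enumerate are made up for illustration, and both assume times is greater than zero, because chunks_mut panics on a chunk size of 0.

```rust
/// Repeats every input value `times` times by zipping chunks with values.
fn repeat_zip(input: &[i32], times: usize) -> Vec<i32> {
    let mut output = vec![0; input.len() * times];
    for (chunk, &value) in output.chunks_mut(times).zip(input) {
        chunk.fill(value); // no index needed at all
    }
    output
}

/// Same result, but looks the value up through the chunk index.
fn repeat_enumerate(input: &[i32], times: usize) -> Vec<i32> {
    let mut output = vec![0; input.len() * times];
    for (i, chunk) in output.chunks_mut(times).enumerate() {
        chunk.fill(input[i]); // indexing incurs a bounds check
    }
    output
}

fn main() {
    let input = [4, 7, 2, 3, 5, 8];
    let a = repeat_zip(&input, 4);
    let b = repeat_enumerate(&input, 4);
    assert_eq!(a, b);
    println!("{:?}", &a[..8]); // [4, 4, 4, 4, 7, 7, 7, 7]
}
```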

Updating mutable HashMap in a while loop

I am trying to implement Karger's algorithm in Rust and I have run into an issue while trying to update a mutable hashmap in a while loop.
The map is updated successfully, but then in the next iteration when it was cloned, the values that were updated don't seem to have been changed. However removing elements from the map is reflected on later iterations.
I have tried debugging and printing the values of the map, but the sequence of events doesn't make sense to me.
use itertools::Itertools; // 0.8.0
use rand::seq::{IteratorRandom, SliceRandom}; // 0.6.5
use std::collections::HashMap;

fn contract_edge(graph: &mut HashMap<i32, Vec<i32>>, num_trials: i32) {
    let mut count = 0;
    while graph.len() > 2 && count < num_trials {
        // clone graph so I can mutate graph later
        let imut_graph = graph.clone();
        // choose random node
        let from_value = imut_graph
            .keys()
            .choose(&mut rand::thread_rng())
            .unwrap()
            .clone();
        let values = imut_graph.get(&from_value);
        let to_value = values
            .unwrap()
            .choose(&mut rand::thread_rng())
            .unwrap()
            .clone();
        let from_edges = imut_graph[&from_value].iter().clone();
        // accessing to_value in imut_graph gives error here later
        let to_edges = imut_graph[&to_value]
            .iter()
            .clone()
            .filter(|&x| *x != from_value && *x != to_value);
        let new_edges = from_edges.chain(to_edges);
        // since I am mutating the graph I thought the next time it is cloned it would be updated?
        graph.insert(from_value, new_edges.map(|v| v.clone()).collect());
        graph.remove(&to_value);
        for (_key, val) in graph.iter_mut() {
            *val = val
                .iter()
                .map(|v| if v == &to_value { &from_value } else { v })
                .unique()
                .cloned()
                .collect();
        }
        count += 1;
    }
}
When I try to access the map, I get element not found error, but the keys which have been removed should not exist in the vector values at that point.
I am convinced this is something I don't understand about (Im)mutability in Rust.

I'm not really sure what you are trying to achieve here, but based on what I can see above, that is, you'd like to mutate your original graph (because you are passing that as a mutable borrow to your function) and that you don't have a return value, and that your question is about mutating a hashmap -- I assume that you'd like the changes to be reflected in your original HashMap. So why are you cloning it in the first place then?
If on the other hand you don't want to mutate your original object, then don't pass it in as a mutable borrow, but as an immutable one. Then create a clone of it before you start the loop and use that cloned version throughout your algorithm.
The problem you are facing happens because on every iteration you clone the original graph rather than your cloned imut_graph. That is, every iteration creates a new HashMap, which you then mutate; when the next cycle starts you are still checking the length of the original one, and then you clone the original one again.
So the two options you have are:
use std::collections::HashMap;

fn mutated(map: &mut HashMap<i32, Vec<i32>>) {
    map.insert(1, vec![4, 5, 6]);
}

fn cloned(map: &HashMap<i32, Vec<i32>>) -> HashMap<i32, Vec<i32>> {
    let mut map = map.clone();
    map.insert(2, vec![7, 8, 9]);
    map
}

fn main() {
    let mut map = HashMap::new();
    map.insert(0, vec![1, 2, 3]);
    println!("{:?}", cloned(&map));
    mutated(&mut map);
    println!("{:?}", map);
}
Which will give you:
{0: [1, 2, 3], 2: [7, 8, 9]}
{0: [1, 2, 3], 1: [4, 5, 6]}
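Applied to the graph from the question, the "mutate in place" option might look like the sketch below, which contracts one edge without cloning the whole map. Several things here are assumptions made for illustration: the helper name contract is hypothetical, the two endpoints are passed in rather than chosen at random, and parallel edges are kept while self-loops are dropped (the question's code deduplicates with unique() instead).

```rust
use std::collections::HashMap;

/// Merges node `to` into node `from`, mutating `graph` in place.
/// (Hypothetical helper; the endpoints are chosen by the caller.)
fn contract(graph: &mut HashMap<i32, Vec<i32>>, from: i32, to: i32) {
    // Take ownership of the removed node's edges; no clone of the map is needed.
    let to_edges = graph.remove(&to).unwrap_or_default();
    if let Some(from_edges) = graph.get_mut(&from) {
        from_edges.extend(to_edges);
    }
    // Redirect every remaining reference to `to` so it points at `from`.
    for edges in graph.values_mut() {
        for v in edges.iter_mut() {
            if *v == to {
                *v = from;
            }
        }
    }
    // Edges from the merged node to itself are self-loops; drop them.
    if let Some(from_edges) = graph.get_mut(&from) {
        from_edges.retain(|&v| v != from);
    }
}

fn main() {
    // A triangle: 1 - 2 - 3 - 1.
    let mut graph = HashMap::from([(1, vec![2, 3]), (2, vec![1, 3]), (3, vec![1, 2])]);
    contract(&mut graph, 1, 2);
    println!("{:?}", graph.get(&1)); // Some([3, 3])
    println!("{:?}", graph.get(&3)); // Some([1, 1])
}
```

Since every change is made directly on the borrowed map, the next loop iteration sees the updated graph and no removed key can be looked up again.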

How to define a function in Rust?

I decided to do a 2D vector cross product in Rust. In C++, this is simple to do:
float CrossProduct( const Vec2& a, const Vec2& b ) {
    return a.x * b.y - a.y * b.x;
}
I tried to convert it to the Rust system:
// Just created two separate variables for the two different vectors
let vec1 = vec![1.15, 7.0];
let vec2 = vec![7.0, 2.0];
let cross_product(&vec1, &vec2) = vec1[0] * vec2[1] - vec1[1] * vec2[0];
println!("{}", cross_product);
// I also tried return.
let vec1 = vec![1.15, 7.0];
let vec2 = vec![7.0, 2.0];
let cross_product(&vec1, &vec2) {
return (vec1[0] * vec2[1] - vec1[1] * vec2[0]);
}
println!("{}", cross_product);
I thought that one of these would work, however this was more of a reality check to me how different Rust can be from any language I have used previously.
I found a very inefficient way to work around this, however I would rather learn to do this correctly. I am new to Rust, so please take my attempts with a grain of salt.

There are two possible ways to do this.
First Way
You can declare a function and pass its result to println!(), much as you would in many other programming languages such as Java or C#.
// Declare the function
fn cross_product(slice1: &[i32], slice2: &[i32]) -> i32 {
    // 2D cross product: only the first two elements of each slice are used
    slice1[0] * slice2[1] - slice1[1] * slice2[0]
}

// Use it like the following
fn main() {
    let vec1 = vec![1, 2, 3];
    let vec2 = vec![4, 5, 6];
    println!("{}", cross_product(&vec1[..], &vec2[..]));
}
Second Way
You can declare a closure and pass its result to println!(), a common approach in functional programming:
// You can declare a closure and use it as a function in the same code block
fn main() {
    let vec1 = vec![1, 2, 3];
    let vec2 = vec![4, 5, 6];
    let cross_product = |slice1: &[i32], slice2: &[i32]| -> i32 {
        let result = slice1[0] * slice2[1] - slice1[1] * slice2[0];
        result
    };
    println!("{}", cross_product(&vec1[..], &vec2[..]));
}
Please note that I have created the vectors and closures using the i32 data type, which is an integer. You can change the type to f32 or, if you want a wider float range, f64.

It looks like you are mainly having problems with Rust syntax. You can either create a cross product function or do the cross product inline.
let vec1 = vec![1.15, 7.0];
let vec2 = vec![7.0, 2.0];
let cross_product = vec1[0] * vec2[1] - vec1[1] * vec2[0];
println!("{}", cross_product);
If you want a function you can reuse:
// Borrow the vectors as slices so the caller keeps ownership.
fn function_cross_product(vec1: &[f64], vec2: &[f64]) -> f64 {
    vec1[0] * vec2[1] - vec1[1] * vec2[0]
}
let other_product = function_cross_product(&vec1, &vec2);
println!("{}", other_product);
The second solution can be misleading because it only ever uses the first two elements of each vector, even if you pass longer ones (and it panics if either has fewer than two).
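To rule out wrongly sized inputs entirely, the C++-style signature from the question can be mirrored with a small struct, so that a vector always has exactly two components. This is a sketch only; the Vec2 type and its field names are assumptions carried over from the question's snippet, not part of any library.

```rust
/// A 2D vector with exactly two components (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Vec2 {
    x: f64,
    y: f64,
}

/// 2D cross product: the z-component of the 3D cross product.
fn cross_product(a: &Vec2, b: &Vec2) -> f64 {
    a.x * b.y - a.y * b.x
}

fn main() {
    let v1 = Vec2 { x: 1.15, y: 7.0 };
    let v2 = Vec2 { x: 7.0, y: 2.0 };
    println!("{}", cross_product(&v1, &v2));
}
```

With this design the compiler rejects a call that passes anything other than two-component vectors, so no runtime length check is needed.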

Explicit partial array initialisation in Rust

In C, I can write int foo[100] = { 7, 8 }; and I will get [7, 8, 0, 0, 0...].
This allows me to explicitly and concisely choose initial values for a contiguous group of elements at the beginning of the array, and the remainder will be initialised as if they had static storage duration (i.e. to the zero value of the appropriate type).
Is there an equivalent in Rust?

To the best of my knowledge, there is no such shortcut. You do have a few options, though.
The direct syntax
The direct syntax to initialize an array works with Copy types (integers are Copy):
let array = [0; 1024];
initializes an array of 1024 elements with all 0s.
Based on this, you can afterwards modify the array:
let array = {
    let mut array = [0; 1024];
    array[0] = 7;
    array[1] = 8;
    array
};
Note the trick of using a block expression to isolate the mutability to a smaller section of the code; we'll reuse it below.
The iterator syntax
You can also fill the leading elements by iterating over the array:
let array = {
    let mut array = [0; 1024];
    for (i, element) in array.iter_mut().enumerate().take(2) {
        *element = i + 7;
    }
    array
};
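On newer toolchains, a similar effect can be had without any mutable binding by building the array from its indices. The sketch below assumes Rust 1.63 or later for std::array::from_fn; the helper name with_prefix is hypothetical, and it silently ignores prefix values that do not fit into the array.

```rust
/// Builds an `[i32; N]` whose first elements come from `prefix`
/// and whose remaining elements are zero. (Hypothetical helper.)
fn with_prefix<const N: usize>(prefix: &[i32]) -> [i32; N] {
    // `get` returns None past the end of `prefix`, which maps to 0.
    std::array::from_fn(|i| prefix.get(i).copied().unwrap_or(0))
}

fn main() {
    // Mirrors the C declaration `int foo[100] = { 7, 8 };`
    let foo: [i32; 100] = with_prefix(&[7, 8]);
    println!("{:?}", &foo[..4]); // [7, 8, 0, 0]
}
```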
And you can even (optionally) start from an uninitialized state, using an unsafe block:
use std::mem::{self, MaybeUninit};
let array: [i32; 10] = unsafe {
    // Create an array of uninitialized slots (mem::uninitialized is deprecated).
    let mut array: [MaybeUninit<i32>; 10] = MaybeUninit::uninit().assume_init();
    let nonzero = 2;
    for (i, element) in array.iter_mut().enumerate().take(nonzero) {
        // Initialize the slot without reading or dropping the old contents.
        element.write(i as i32 + 7);
    }
    for element in array.iter_mut().skip(nonzero) {
        element.write(0);
    }
    // Every slot has been written, so reinterpreting as [i32; 10] is sound.
    mem::transmute::<_, [i32; 10]>(array)
};
The shorter iterator syntax
There is a shorter form based on clone_from_slice (stable since Rust 1.7). Source and destination must have the same length, so slice the destination first:
let array = {
    let mut array = [0; 32];
    // Override beginning of array
    array[..2].clone_from_slice(&[7, 8]);
    array
};

Here is a macro:
macro_rules! array {
    ($($v:expr),*) => (
        {
            let mut array = Default::default();
            {
                let mut e = <_ as ::std::convert::AsMut<[_]>>::as_mut(&mut array).iter_mut();
                $(*e.next().unwrap() = $v);*;
            }
            array
        }
    )
}

fn main() {
    let a: [usize; 5] = array!(7, 8);
    assert_eq!([7, 8, 0, 0, 0], a);
}
