Proptest: Strategy to generate vectors of vectors - rust

I want to generate DAGs with proptest. I have written the plain algorithm I would like to use below, but I need help transforming it into a proptest strategy.
What would a strategy need to look like that did the same as the below code but without using a random number generator? (It goes without saying that random number generators are bad for property-based testing.)
Standard code without proptest strategy:
(https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=2de4a757a96d123bf83b5157e0633d33)
use rand::Rng;
fn main() {
println!("{:?}", random_vec_of_vec());
}
fn random_vec_of_vec() -> Vec<Vec<u16>> {
const N: u16 = 30;
const K: usize = 3;
let mut rng = rand::thread_rng();
let length: u16 = rng.gen_range(0, N);
let mut outer = vec![];
for index in 1..length {
let mut inner = vec![0u16; rng.gen_range(0, K)];
for e in &mut inner {
*e = rng.gen_range(0, index);
}
// De-duplicate elements. Particularly a problem with `index < K`.
inner.sort();
inner.dedup();
outer.push(inner);
}
outer
}
Previous work
I tried using the vec function, but I would need to nest two vec functions. And, the inner vec function could only generate values up to the index in the outer vector.
use proptest::collection::vec;
// INDEX should be the value of the position of the inner vector
// in the outer vector. How could that be found?
let strategy = vec(vec(1..INDEX, 0..K), 0..N);
The index method is not helpful because the right size would still not be known.

One way to go about this is to replace each rng.gen_range() call with a strategy. Nested strategies must then be connected with prop_flat_map.
In the code below, I replaced the pattern let length = rng.gen_range(0, N); for i in 1..length { .. } with a new function vec_from_length(length: u16), which returns a Strategy.
#[cfg(test)]
mod tests {
use super::*;
use proptest::collection::hash_set;
use proptest::prelude::*;
use std::collections::HashSet;
proptest! {
#[test]
fn meaningless_test(v in vec_of_vec()) {
let s = sum(&v); // sum of the sum of all vectors.
prop_assert!(s < 15);
}
}
fn vec_of_vec() -> impl Strategy<Value = Vec<Vec<u16>>> {
const N: u16 = 10;
let length = 0..N;
length.prop_flat_map(vec_from_length).prop_map(convert)
}
fn vec_from_length(length: u16) -> impl Strategy<Value = Vec<HashSet<u16>>> {
const K: usize = 5;
let mut result = vec![];
for index in 1..length {
// Using a hash_set instead of vec because the elements should be unique.
let inner = hash_set(0..index, 0..K);
result.push(inner);
}
result
}
/// Convert Vec<HashSet<T>> to Vec<Vec<T>>
fn convert(input: Vec<HashSet<u16>>) -> Vec<Vec<u16>> {
let mut output = vec![];
for inner in input {
output.push(inner.into_iter().collect())
}
output
}
}
One more thing: an impl Strategy<Value = Vec<T>> can be obtained either from the vec function (a strategy of vectors) or from a vector of strategies. In the code above, result is filled with hash_set(..) values, each of which is a Strategy, so its type is something like Vec<Strategy<T>> rather than Strategy<Vec<T>> (pedantically, Strategy is a trait, not a type).
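To give the generated values something meaningful to be tested against, the invariant the generator is supposed to uphold can be written down as a plain function. The following is only a sketch: the name is_valid_dag is hypothetical, and it assumes (as in the code above) that the inner collection at position p belongs to node p + 1 and may only reference smaller node numbers, without duplicates.

```rust
use std::collections::HashSet;

/// Hypothetical helper: checks the invariant the generator is meant to uphold.
/// The inner vector at position `p` belongs to node `p + 1`, so every element
/// must be at most `p`, and no element may appear twice.
fn is_valid_dag(outer: &[Vec<u16>]) -> bool {
    outer.iter().enumerate().all(|(p, inner)| {
        let unique: HashSet<u16> = inner.iter().copied().collect();
        let no_duplicates = unique.len() == inner.len();
        let only_earlier_nodes = inner.iter().all(|&e| usize::from(e) <= p);
        no_duplicates && only_earlier_nodes
    })
}
```

Inside a proptest! block this could be used as prop_assert!(is_valid_dag(&v)) in place of the meaningless sum check.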

Related

Iterating over a vector gives me a different value than what is inside the vector Rust

I have been using Petgraph recently to make simple graphs with Structs for nodes and custom edges, but I have come across a problem which I am unsure if it comes from the library or Rust.
I have a graph with multiple nodes, and each node has a name. I put the indices of all the nodes (of type NodeIndex) in a vector, since as far as I can see Petgraph doesn't have a function that returns all the nodes of a graph. I then want to write a function that, given a string, returns the index of the node whose name matches.
My problem is that somehow the type in the vector containing the nodes seems to change. I store it as NodeIndex yet the types somehow change by themselves to u32 without me changing anything. Since it changes automatically, I can't pass the values inside Petgraph functions since they require NodeIndex as inputs and not u32.
The code following is what I have so far and the problem arises in the function find_node_index_with_name where the types seem to change even though I pass a vector of NodeIndex as input so when I iterate over it, I should also get NodeIndex back.
use petgraph::adj::NodeIndex;
use petgraph::stable_graph::StableGraph;
use petgraph::dot::Dot;
#[derive(Clone,Debug,Default)]
struct ControlBloc
{
name:String,
value:u32,
}
fn create_bloc(name:String,value:u32) -> ControlBloc
{
ControlBloc
{
name,
value,
}
}
fn find_node_index_with_name(gr:StableGraph<ControlBloc,u32> , nodes:Vec<NodeIndex> , name_search:String) -> Option<NodeIndex>
{
for i in 0..nodes.len()
{
if gr.node_weight(nodes[i]).unwrap().name == name_search
{
return nodes[i];
}
}
return None;
}
fn main() {
let mut graph = StableGraph::<ControlBloc,u32>::new();
let m = create_bloc(String::from("Main"),10);
let b1 = create_bloc(String::from("sub1"),20);
let b2 = create_bloc(String::from("sub2"),30);
let main = graph.add_node(m);
let sub1 = graph.add_node(b1);
let sub2 = graph.add_node(b2);
let all_nodes = vec![main,sub1,sub2];
println!("{:?}",find_node_index_with_name(graph, all_nodes, String::from("Main")));
}
I am a bit stumped as to why the types change.
Thank you for any inputs!
graph.add_node() returns a petgraph::graph::NodeIndex.
But you imported petgraph::adj::NodeIndex, which is a different type (as far as I can tell, a plain alias for the underlying index integer, u32 by default), hence the type mismatch.
I took the liberty of changing your code a bit so that it uses references where you used owned values.
use petgraph::graph::NodeIndex; // graph not adj
use petgraph::stable_graph::StableGraph;
#[derive(Clone, Debug, Default)]
struct ControlBloc {
name: String,
value: u32,
}
fn create_bloc(
name: String,
value: u32,
) -> ControlBloc {
ControlBloc { name, value }
}
fn find_node_index_with_name(
gr: &StableGraph<ControlBloc, u32>,
nodes: &[NodeIndex],
name_search: &str,
) -> Option<NodeIndex> {
nodes
.iter()
.copied()
.find(|n| gr.node_weight(*n).unwrap().name == name_search)
/*
for i in 0..nodes.len() {
if gr.node_weight(nodes[i]).unwrap().name == name_search {
return Some(nodes[i]);
}
}
None
*/
}
fn main() {
let mut graph = StableGraph::<ControlBloc, u32>::new();
let m = create_bloc(String::from("Main"), 10);
let b1 = create_bloc(String::from("sub1"), 20);
let b2 = create_bloc(String::from("sub2"), 30);
let main = graph.add_node(m);
let sub1 = graph.add_node(b1);
let sub2 = graph.add_node(b2);
let all_nodes = vec![main, sub1, sub2];
for n in ["Main", "sub1", "sub2"] {
println!("{:?}", find_node_index_with_name(&graph, &all_nodes, n));
}
}
/*
Some(NodeIndex(0))
Some(NodeIndex(1))
Some(NodeIndex(2))
*/
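The reason the compiler reports u32 instead of NodeIndex can be illustrated without petgraph. The sketch below uses hypothetical stand-ins (AdjIndex, GraphIndex), not petgraph's real definitions, and assumes that the adj index is a type alias while the graph index is a newtype wrapper.

```rust
// Hypothetical stand-ins for illustration only; not petgraph's definitions.

// A type alias is just another name for u32, so error messages talk about u32.
type AdjIndex = u32;

// A newtype is a distinct type that merely wraps a u32.
#[derive(Clone, Copy, Debug, PartialEq)]
struct GraphIndex(u32);

impl GraphIndex {
    fn new(i: u32) -> Self {
        GraphIndex(i)
    }
    fn index(self) -> u32 {
        self.0
    }
}

// Accepts any u32, because AdjIndex and u32 are the same type.
fn takes_alias(i: AdjIndex) -> u32 {
    i
}

// Accepts only GraphIndex; passing a bare u32 here is a type error.
fn takes_newtype(i: GraphIndex) -> u32 {
    i.index()
}
```

Calling takes_newtype(7u32) would fail to compile with a mismatch between u32 and GraphIndex, which mirrors the error in the question.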

How can I create a fixed size array of Strings using constant generics?

I have a function using a constant generic:
fn foo<const S: usize>() -> Vec<[String; S]> {
// Some code
let mut row: [String; S] = Default::default(); // Fails: Default is only implemented for arrays of up to 32 elements
// Some code
}
How can I create a fixed size array of Strings in my case? let mut row: [String; S] = ["".to_string(); S]; doesn't work because String doesn't implement the Copy trait.
You can do it with MaybeUninit and unsafe:
use std::mem::MaybeUninit;
fn foo<const S: usize>() -> Vec<[String; S]> {
// Some code
let mut row: [String; S] = unsafe {
let mut result = MaybeUninit::uninit();
let start = result.as_mut_ptr() as *mut String;
for pos in 0 .. S {
// SAFETY: safe because loop ensures `start.add(pos)`
// is always on an array element, of type String
start.add(pos).write(String::new());
}
// SAFETY: safe because loop ensures entire array
// has been manually initialised
result.assume_init()
};
// Some code
todo!()
}
Of course, it might be easier to abstract such logic to your own trait:
use std::mem::MaybeUninit;
trait DefaultArray {
fn default_array() -> Self;
}
impl<T: Default, const S: usize> DefaultArray for [T; S] {
fn default_array() -> Self {
let mut result = MaybeUninit::uninit();
let start = result.as_mut_ptr() as *mut T;
unsafe {
for pos in 0 .. S {
// SAFETY: safe because loop ensures `start.add(pos)`
// is always on an array element, of type T
start.add(pos).write(T::default());
}
// SAFETY: safe because loop ensures entire array
// has been manually initialised
result.assume_init()
}
}
}
(The only reason for using your own trait rather than Default is that implementations of the latter would conflict with those provided in the standard library for arrays of up to 32 elements; I wholly expect the standard library to replace its implementation of Default with something similar to the above once const generics have stabilised).
In which case you would now have:
fn foo<const S: usize>() -> Vec<[String; S]> {
// Some code
let mut row: [String; S] = DefaultArray::default_array();
// Some code
todo!()
}
See it on the Playground.
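On newer toolchains (Rust 1.63 or later) the same result can be had without unsafe by using std::array::from_fn, which builds an array by calling a closure once per index. A minimal sketch, where the helper name empty_row is hypothetical:

```rust
/// Hypothetical helper: builds an array of `S` empty Strings without
/// requiring `Copy` or `Default` for the array type itself.
fn empty_row<const S: usize>() -> [String; S] {
    // The closure receives the element index; it is ignored here.
    std::array::from_fn(|_| String::new())
}
```

Inside foo this would read let mut row: [String; S] = empty_row(); and it works for any S, including sizes above 32.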
At the time this answer was written, const generics could not be compiled on stable Rust (they have since been stabilised). As @AlexLarionov said, you can try to use procedural macros, but that approach still has its bugs and limitations.
If you need a generic that has to be a number, you can use the Num crate, or the more verbose std::num.

Rust: Map two Vecs into a third Vec of composite structs

I have three structs:
struct A;
struct B;
struct C {
a: Option<A>,
b: Option<B>
}
Given inputs Vec<A> and Vec<B> and some predicate function, I want to create an output Vec<C>, which is a combination of the elements of the inputs, something like the following:
let aVec: Vec<A> = vec![];
let bVec: Vec<B> = vec![];
let mut cVec: Vec<C> = vec![];
for a in aVec {
if let Some(b) = bVec.into_iter().find(predicate) {
cVec.push(C{a: Some(a), b: Some(b)});
}
}
Is there a way to do this without needing B to be copyable? Both input vectors aren't required after the operation. Also, is this possible without the loop?
You can:
1. Find the index of the element satisfying predicate. (I would use Iterator::position.)
2. remove or swap_remove the element at the position obtained in the previous step.
3. push the previously removed element into the result.
In code:
// No external crate is needed: `position` is a method on `Iterator`.
struct A {}
struct B {
n: usize,
}
struct C {
a: Option<A>,
b: Option<B>
}
fn main() {
let aVec: Vec<A> = vec![];
let mut bVec: Vec<B> = vec![];
let mut cVec: Vec<C> = vec![];
for a in aVec {
if let Some(idx) = bVec.iter()
.position(|b| b.n==42)
{
let b = bVec.remove(idx); // or swap_remove if ordering does not need to be preserved
cVec.push(C{a: Some(a), b: Some(b)});
}
}
}
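As for doing this without an explicit loop, the same three steps can be folded into a filter_map. This is only a sketch under some assumptions: the function name pair_up is hypothetical, the structs are given illustrative fields (id, n), and the predicate is assumed to compare an A with a B, which the question leaves open.

```rust
struct A { id: u32 }
struct B { n: u32 }
struct C {
    a: Option<A>,
    b: Option<B>,
}

/// Hypothetical helper: pairs each `A` with the first remaining `B`
/// accepted by `predicate`. Both vectors are consumed; nothing is cloned.
fn pair_up(a_vec: Vec<A>, mut b_vec: Vec<B>, predicate: impl Fn(&A, &B) -> bool) -> Vec<C> {
    a_vec
        .into_iter()
        .filter_map(|a| {
            // Step 1: locate a matching B; `?` skips this A if there is none.
            let idx = b_vec.iter().position(|b| predicate(&a, b))?;
            // Step 2: move it out of the vector (order of b_vec is not preserved).
            let b = b_vec.swap_remove(idx);
            // Step 3: build the combined value.
            Some(C { a: Some(a), b: Some(b) })
        })
        .collect()
}
```

Note that an A without a matching B is dropped, just as in the loop version.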

How do I explicitly assign a type to an `Option<usize>` that is used in a `match`?

I'm making a (probably bad) sorting algorithm as a practice experiment.
I'm trying to take an unsorted list of i32 with duplicates, break it out into an array of sorted arrays (of various sizes) which I can then efficiently recombine into a single fully sorted array. The recombination isn't implemented yet.
mod sort {
use std::collections::VecDeque;
pub fn sort_i32(unsorted_list: &Vec<i32>) { // -> Vec<i32> {
let mut sorting = Vec::with_capacity(unsorted_list.len());
let mut sorting_index = None;
// let mut index: usize = 0;
for number in unsorted_list {
match sorting_index { // sorting_index: Option<usize>
Some(index) => { // index: usize // index<usize>
// let index: usize = index as usize;
if number >= sorting[index].front() {
sorting[index].push_front(number);
} else if number <= sorting[index].back() {
sorting[index].push_back(number);
} else {
let index = index + 1; //index: usize
sorting_index = Some(index);
sorting[index] = VecDeque::with_capacity(unsorted_list.len());
sorting[index].push_front(number);
}
}
None => {
// have to initialize here because we need the first `number` to do so
let index = 0; //index: usize
sorting_index = Some(index);
sorting[index] = VecDeque::with_capacity(unsorted_list.len());
sorting[index].push_front(number);
}
}
}
}
}
I think I need to explicitly tell the compiler that index will be usize as it's going to be an index into a vector:
error[E0282]: type annotations needed
--> src/main.rs:12:34
|
12 | if number >= sorting[index].front() {
| ^^^^^^^^^^^^^^ cannot infer type for `_`
Normal typing syntax doesn't seem to work; as you can see from the comments I've tried a few approaches already.
The first thing to do is to type sorting and sorting_index:
let mut sorting: Vec<VecDeque<i32>> = Vec::with_capacity(unsorted_list.len());
let mut sorting_index: Option<usize> = None;
That will lead to a number of adjustments and decisions at various sites in the code where sorting[index] is accessed. I can add more, but after that it's mostly a matter of working through the compiler errors and making decisions on how you want to handle those cases in your algorithm.
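To show where those adjustments might end up, here is one possible shape of the run-splitting step once the types are pinned down. It is a sketch, not the asker's final algorithm: the name split_into_runs is hypothetical, it copies the i32 values instead of storing references, it handles the Option returned by front() and back(), and it pushes new deques onto the outer vector instead of assigning to an index that does not exist yet.

```rust
use std::collections::VecDeque;

/// Hypothetical helper: splits `unsorted` into runs that are each sorted
/// in descending order from front to back.
fn split_into_runs(unsorted: &[i32]) -> Vec<VecDeque<i32>> {
    let mut sorting: Vec<VecDeque<i32>> = Vec::new();
    for &number in unsorted {
        // Try to extend the most recent run at either end.
        let placed = match sorting.last_mut() {
            Some(run) => {
                if run.front().map_or(false, |&f| number >= f) {
                    run.push_front(number);
                    true
                } else if run.back().map_or(false, |&b| number <= b) {
                    run.push_back(number);
                    true
                } else {
                    false
                }
            }
            None => false,
        };
        // Otherwise start a new run containing just this number.
        if !placed {
            sorting.push(VecDeque::from(vec![number]));
        }
    }
    sorting
}
```

Using last_mut() removes the need for a separate sorting_index variable, which is the value whose type could not be inferred.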

Using the same iterator multiple times in Rust

Editor's note: This code example is from a version of Rust prior to 1.0 when many iterators implemented Copy. Updated versions of this code produce different errors, but the answers still contain valuable information.
I'm trying to write a function to split a string into clumps of letters and numbers; for example, "test123test" would turn into [ "test", "123", "test" ]. Here's my attempt so far:
pub fn split(input: &str) -> Vec<String> {
let mut bits: Vec<String> = vec![];
let mut iter = input.chars().peekable();
loop {
match iter.peek() {
None => return bits,
Some(c) => if c.is_digit() {
bits.push(iter.take_while(|c| c.is_digit()).collect());
} else {
bits.push(iter.take_while(|c| !c.is_digit()).collect());
}
}
}
return bits;
}
However, this doesn't work: it loops forever. It seems that it is using a clone of iter each time I call take_while, starting from the same position over and over again. I would like it to use the same iter each time, advancing the same iterator across all the take_while calls. Is this possible?
As you identified, each take_while call is duplicating iter, since take_while takes self and the Peekable chars iterator is Copy. (Only true before Rust 1.0 — editor)
You want to be modifying the iterator each time, that is, for take_while to be operating on an &mut to your iterator. Which is exactly what the .by_ref adaptor is for:
pub fn split(input: &str) -> Vec<String> {
let mut bits: Vec<String> = vec![];
let mut iter = input.chars().peekable();
loop {
match iter.peek().map(|c| *c) {
None => return bits,
Some(c) => if c.is_digit(10) {
bits.push(iter.by_ref().take_while(|c| c.is_digit(10)).collect());
} else {
bits.push(iter.by_ref().take_while(|c| !c.is_digit(10)).collect());
},
}
}
}
fn main() {
println!("{:?}", split("123abc456def"))
}
Prints
["123", "bc", "56", "ef"]
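However, this is not the desired output: take_while also consumes the first character that fails its predicate, so the first character of every clump after the first one is lost.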
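On current Rust (1.51 or later), one way to keep that boundary character is Peekable::next_if, which only advances the iterator when the predicate accepts the next item. The sketch below is an alternative to the by_ref version, not part of the original answer, and the function name split_with_next_if is hypothetical.

```rust
/// Hypothetical variant of `split` that never consumes a character
/// belonging to the next clump.
pub fn split_with_next_if(input: &str) -> Vec<String> {
    let mut bits = Vec::new();
    let mut iter = input.chars().peekable();
    // Look at the first character of the next clump without consuming it.
    while let Some(&first) = iter.peek() {
        let want_digit = first.is_digit(10);
        let mut bit = String::new();
        // Take characters only while they are of the same kind as `first`.
        while let Some(c) = iter.next_if(|c| c.is_digit(10) == want_digit) {
            bit.push(c);
        }
        bits.push(bit);
    }
    bits
}
```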
I would actually recommend writing this as a normal for loop, using the char_indices iterator:
pub fn split(input: &str) -> Vec<String> {
let mut bits: Vec<String> = vec![];
if input.is_empty() {
return bits;
}
let mut is_digit = input.chars().next().unwrap().is_digit(10);
let mut start = 0;
for (i, c) in input.char_indices() {
let this_is_digit = c.is_digit(10);
if is_digit != this_is_digit {
bits.push(input[start..i].to_string());
is_digit = this_is_digit;
start = i;
}
}
bits.push(input[start..].to_string());
bits
}
This form also allows for far fewer allocations (that is, the Strings are not required), because each returned value is just a slice into the input, and we can use lifetimes to state this:
pub fn split<'a>(input: &'a str) -> Vec<&'a str> {
let mut bits = vec![];
if input.is_empty() {
return bits;
}
let mut is_digit = input.chars().next().unwrap().is_digit(10);
let mut start = 0;
for (i, c) in input.char_indices() {
let this_is_digit = c.is_digit(10);
if is_digit != this_is_digit {
bits.push(&input[start..i]);
is_digit = this_is_digit;
start = i;
}
}
bits.push(&input[start..]);
bits
}
All that changed was the type signature, removing the Vec<String> type hint and the .to_string calls.
One could even write an iterator like this, to avoid having to allocate the Vec. Something like fn split<'a>(input: &'a str) -> Splits<'a> { /* construct a Splits */ } where Splits is a struct that implements Iterator<Item = &'a str>.
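A minimal sketch of what such a Splits iterator could look like follows. The field name rest and the constructor name split_iter are assumptions made for illustration; only the Splits name comes from the text above.

```rust
/// Lazily yields clumps of digits and non-digits as slices of the input.
pub struct Splits<'a> {
    // The part of the input that has not been yielded yet.
    rest: &'a str,
}

/// Hypothetical constructor for `Splits`.
pub fn split_iter(input: &str) -> Splits<'_> {
    Splits { rest: input }
}

impl<'a> Iterator for Splits<'a> {
    type Item = &'a str;

    fn next(&mut self) -> Option<&'a str> {
        let rest = self.rest;
        // An empty remainder means the iterator is exhausted.
        let is_digit = rest.chars().next()?.is_digit(10);
        // Byte offset of the first character of the other kind, if any.
        let end = rest
            .char_indices()
            .find(|&(_, c)| c.is_digit(10) != is_digit)
            .map_or(rest.len(), |(i, _)| i);
        let (head, tail) = rest.split_at(end);
        self.rest = tail;
        Some(head)
    }
}
```

Callers that do want a Vec can still write split_iter(input).collect::<Vec<_>>().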
take_while takes self by value: it consumes the iterator. Before Rust 1.0 it also was unfortunately able to be implicitly copied, leading to the surprising behaviour that you are observing.
You cannot use take_while for what you are wanting for these reasons. You will need to manually unroll your take_while invocations.
Here is one of many possible ways of dealing with this:
pub fn split(input: &str) -> Vec<String> {
let mut bits: Vec<String> = vec![];
let mut iter = input.chars().peekable();
loop {
let seeking_digits = match iter.peek() {
None => return bits,
Some(c) => c.is_digit(10),
};
if seeking_digits {
bits.push(take_while(&mut iter, |c| c.is_digit(10)));
} else {
bits.push(take_while(&mut iter, |c| !c.is_digit(10)));
}
}
}
fn take_while<I, F>(iter: &mut std::iter::Peekable<I>, predicate: F) -> String
where
I: Iterator<Item = char>,
F: Fn(&char) -> bool,
{
let mut out = String::new();
loop {
match iter.peek() {
Some(c) if predicate(c) => out.push(*c),
_ => return out,
}
let _ = iter.next();
}
}
fn main() {
println!("{:?}", split("test123test"));
}
This yields a solution with two levels of looping; another valid approach would be to model it as a state machine one level deep only. Ask if you aren’t sure what I mean and I’ll demonstrate.
