nested looping prints strange results - rust

I was trying to check a 2D vector and noticed some weird behavior that I haven't been able to find an explanation for.
use rand::random;

fn main() {
    let mut main_loop = vec![];
    let mut x = 0;
    let board_x = 10;
    let mut y = 0;
    let board_y = 10;
    while y < board_y {
        let mut row = vec![];
        while x < board_x {
            let v = random::<bool>();
            println!("x:{}", x);
            row.push(v);
            x = x + 1;
        }
        println!("y:{}", y);
        main_loop.push(row);
        y = y + 1;
    }
}
This only prints
x:0
x:1
x:2
x:3
x:4
x:5
x:6
x:7
x:8
x:9
y:0
y:1
y:2
y:3
y:4
y:5
y:6
y:7
y:8
y:9
Shouldn't this be printing out x:1 - x:10 10 times? Also, what was even stranger was that when I looped back over the vectors using nested for loops for each row, it counted out 60 indexes before exiting. What am I missing?

Shouldn't this be printing out x:1 - x:10 10 times?
Why would it? You never set x back to zero after the inner loop is done.
Besides which, this sort of iteration should be done with a for loop:
for x in 0..10 { ... }
[...] when I looped back over the vectors using a nested for loops for each row, it counted out 60 indexes before exiting out.
I don't know; you didn't post that code, and I can't see any reason it would do that.
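The fix described in this answer can be sketched as below. This is a minimal sketch, not the asker's final code: `build_board` and its `cell` closure are hypothetical names, and the closure stands in for `random::<bool>()` so the example runs without the `rand` crate.

```rust
/// Builds a board with `height` rows of `width` cells each.
/// `cell` is a hypothetical stand-in for `random::<bool>()`.
fn build_board(
    width: usize,
    height: usize,
    cell: impl Fn(usize, usize) -> bool,
) -> Vec<Vec<bool>> {
    let mut board = Vec::with_capacity(height);
    for y in 0..height {
        let mut row = Vec::with_capacity(width);
        // The range `0..width` is re-created on every pass of the outer
        // loop, so `x` starts from 0 again for each row.
        for x in 0..width {
            row.push(cell(x, y));
        }
        board.push(row);
    }
    board
}

fn main() {
    // Deterministic checkerboard fill, used only for illustration.
    let board = build_board(10, 10, |x, y| (x + y) % 2 == 0);
    println!("rows: {}, cells in first row: {}", board.len(), board[0].len());
}
```

With the original `while` loops, the same effect would need `x = 0;` at the top of the outer loop body; the `for` form makes that reset impossible to forget.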

Related

Multiply numbers from two iterators in order and without duplicates

I have this code and I want every combination to be multiplied:
fn main() {
    let min = 1;
    let max = 9;
    for i in (min..=max).rev() {
        for j in (min..=max).rev() {
            println!("{}", i * j);
        }
    }
}
Result is something like:
81
72
[...]
9
72
65
[...]
8
6
4
2
9
8
7
6
5
4
3
2
1
Is there a clever way to produce the results in descending order (without collecting and sorting) and without duplicates?
Note that this answer provides a solution for this specific problem (multiplication table) but the title asks a more general question (any two iterators).
The naive solution of storing all elements in a vector and then sorting it uses O(n^2 log n) time and O(n^2) space (where n is the size of the multiplication table).
You can use a priority queue to reduce the memory to O(n):
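use std::collections::BinaryHeap;

fn main() {
    let n = 9;
    let mut heap = BinaryHeap::new();
    // seed the heap with the largest element of each sequence: n * j
    for j in 1..=n {
        heap.push((n * j, j));
    }
    let mut last = n * n + 1;
    while let Some((val, j)) = heap.pop() {
        // values pop in non-increasing order, so a repeat equals `last`
        if val < last {
            println!("{val}");
            last = val;
        }
        // advance this sequence by one step until it reaches 1 * j
        if val > j {
            heap.push((val - j, j));
        }
    }
}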
The conceptual idea behind the algorithm is to consider 9 separate sequences
9*9, 9*8, 9*7, .., 9*1
8*9, 8*8, 8*7, .., 8*1
...
1*9, 1*8, 1*7, .., 1*1
Since they are all decreasing, at a given moment, we only need to consider one element of each sequence (the largest one we haven't reached yet).
These are inserted into the priority queue which allows us to efficiently find the maximum one.
Once we have printed a given element we move onto the next one in the sequence and insert that into the priority queue.
By keeping track of the last element printed we can avoid duplicates.
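To make the idea easier to check, here is a sketch of the same heap walk that returns the values instead of printing them. The function name `descending_products` is an assumption made for illustration; the algorithm is the one described above, with `last` tracked as an `Option` rather than a sentinel.

```rust
use std::collections::BinaryHeap;

/// Distinct products i * j for 1 <= i, j <= n, largest first.
fn descending_products(n: u32) -> Vec<u32> {
    let mut heap = BinaryHeap::new();
    // One entry per sequence n*j, (n-1)*j, ..., 1*j, starting at its head.
    for j in 1..=n {
        heap.push((n * j, j));
    }
    let mut out = Vec::new();
    let mut last = None;
    while let Some((val, j)) = heap.pop() {
        // The heap yields values in non-increasing order, so duplicates
        // are always adjacent and can be skipped by comparing with `last`.
        if last != Some(val) {
            out.push(val);
            last = Some(val);
        }
        // Move to the next element of this sequence, if there is one.
        if val > j {
            heap.push((val - j, j));
        }
    }
    out
}

fn main() {
    println!("{:?}", descending_products(3));
}
```

The heap never holds more than `n` entries, which is where the O(n) memory bound comes from.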

Efficiently creating a list of pointers to a character in a buffer using ARM NEON SIMD

I've been rewriting some performance-sensitive parts of my code in AArch64 NEON. For some things, like population count, I've managed to get a 12x speedup. But for some algorithms I'm having trouble.
The high-level problem is quickly adding a list of newline-separated strings to a hashset. Assuming the hashset functionality is optimal (I am looking into it next), I first need to scan for the strings in the buffer.
I have tried various techniques, but my intuition tells me that I can create a list of pointers to each newline, and then insert the resulting slices into the hashset afterwards.
The fundamental problem is that I can't work out an efficient way to load a vector, compare it against the newline, and emit a list of pointers to the newlines. That is, the output has a variable length, depending on how many newlines were found in the input vector.
Here is my approach:
fn read_file7(mut buffer: Vec<u8>, needle: u8) -> Result<HashSet<Vec<u8>>, Error>
{
    let mut set = HashSet::new();
    let mut chunk_offset: usize = 0;
    let special_finder_big = [
        0x80u8, 0x40u8, 0x20u8, 0x10u8, 0x08u8, 0x04u8, 0x02u8, 0x01u8, // high
        0x80u8, 0x40u8, 0x20u8, 0x10u8, 0x08u8, 0x04u8, 0x02u8, 0x01u8, // low
    ];
    let mut next_start: usize = 0;
    let needle_vector = unsafe { vdupq_n_u8(needle) };
    let special_finder_big = unsafe { vld1q_u8(special_finder_big.as_ptr()) };
    let mut line_counter = 0;
    // we process 16 chars at a time
    for chunk in buffer.chunks(16) {
        unsafe {
            let src = vld1q_u8(chunk.as_ptr());
            let out = vceqq_u8(src, needle_vector);
            let anded = vandq_u8(out, special_finder_big);
            // each of these is a bitset of each matching character
            let vadded = vaddv_u8(vget_low_u8(anded));
            let vadded2 = vaddv_u8(vget_high_u8(anded));
            let list = [vadded2, vadded];
            // combine bitsets into one big one!
            let mut num = std::mem::transmute::<[u8; 2], u16>(list);
            // while our bitset has bits left, find the set bits
            while num > 0 {
                let mut xor = 0x8000u16; // only set the highest bit
                let clz = (num).leading_zeros() as usize;
                set.get_or_insert_owned(&buffer[(next_start)..(chunk_offset + clz)]);
                // println!("found '{}' at {} | clz is {} ", needle.escape_ascii(), start_offset + clz, clz);
                // println!("string is '{}'", input[(next_start)..(start_offset + clz)].escape_ascii());
                xor = xor >> clz;
                num = num ^ xor;
                next_start = chunk_offset + clz + 1;
                //println!("new num {:032b}", num);
                line_counter += 1;
            }
        }
        chunk_offset += 16;
    }
    // get the remaining
    set.get_or_insert_owned(&buffer[(next_start)..]);
    println!(
        "line_counter: {} unique elements {}",
        line_counter,
        set.len()
    );
    Ok(set)
}
If I unroll this to do 64 bytes at a time, on a big input it will be slightly faster than memchr, but not by much.
Any tips would be appreciated.
I've shown this to a colleague, who came up with better intrinsics code than I would have. Here's his suggestion. It has not been compiled, so some pseudo-code pieces still need finishing off, but something along the lines of the code below should work and be much faster:
let mut line_counter = 0;
// Read 32 bytes at a time. Note: a final chunk shorter than 32 bytes
// must be handled separately, since each vld1q_u8 reads a full 16 bytes.
for chunk in buffer.chunks(32) {
    unsafe {
        let src1 = vld1q_u8(chunk.as_ptr());
        let src2 = vld1q_u8(chunk.as_ptr().add(16));
        let out1 = vceqq_u8(src1, needle_vector);
        let out2 = vceqq_u8(src2, needle_vector);
        // We slot these next to each other in the same vector.
        // In this case the bottom 64 bits of the vector will tell you
        // if there are any needle values inside the first vector and
        // the top 64 bits tell you if you have any needle values in the
        // second vector.
        let combined = vpmaxq_u8(out1, out2);
        // Now we do another pairwise max, which compresses this information
        // into a single 64-bit value, where the bottom 32 bits tell us about
        // src1 and the top 32 bits about src2.
        let combined = vpmaxq_u8(combined, combined);
        let remapped = vreinterpretq_u64_u8(combined);
        let val = vgetq_lane_u64(remapped, 0);
        if val == 0 {
            // Most chunks won't have a newline: no match was found in
            // either vector, so adjust the offset and continue.
            ...
        }
        if val & 0xFFFF_FFFF != 0 {
            ... // there must be a match in src1. use below code in a function
        }
        if val >> 32 != 0 {
            ... // there must be a match in src2. use below code in a function
        }
        ...
    }
}
Now that we know which vector to look in, we should find the index within that vector.
As an example, let's assume matchvec is the vector we found above (so either out1 or out2).
To find the first index:
// We create a mask of repeating 0xf00f chunks. When we fill an entire vector
// with it we get a pattern where every byte is 0x0f or 0xf0. We'll use this
// to find the index of the matches.
let mask = unsafe { vreinterpretq_u8_u16(vdupq_n_u16(0xf00f)) };
// We first clear the bits we don't want, which leaves 4 bits of free space,
// alternating between the high and low half, in each pair of adjacent bytes.
let masked = vandq_u8(matchvec, mask);
// Which means when we do a pairwise addition
// we are sure that no overflow will ever happen. The entries slot next to each other
// and a non-zero nibble marks a matching element.
// We've also compressed the values into the lower 64 bits again.
let compressed = vpaddq_u8(masked, masked);
let val = vgetq_lane_u64(vreinterpretq_u64_u8(compressed), 0);
// val now holds one 4-bit entry per input byte, with byte 0 in the lowest
// 4 bits, so the first match is found by counting trailing zero bits.
// This assumes Rust has kept val on the SIMD side. If it did not, it may be
// better to do the zero count on the SIMD side and transfer only the result.
let pos = val.trailing_zeros() as usize;
// So just shift pos right by 2 to get the actual index.
let pos = pos >> 2;
pos will now contain the index of the first needle value.
If you were processing out2, remember to add 16 to the result.
To find all the indices, we can walk the bitmask without a count-zeros instruction such as clz, which avoids repeated register-file transfers.
// set masked and compressed as above
let masked = vandq_u8(matchvec, mask);
let compressed = vpaddq_u8(masked, masked);
let mut val = vgetq_lane_u64(vreinterpretq_u64_u8(compressed), 0);
let mut idx = current_offset;
while val != 0 {
    if val & 0xf != 0 {
        // entry found at idx.
    }
    idx += 1;
    val >>= 4;
}
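Since the NEON snippets above were not compiled, a portable scalar model of the nibble-mask step can help check which end of the 64-bit value corresponds to the first byte. The sketch below is not NEON code: `nibble_mask` and `match_indices` are hypothetical helper names, and the model assumes the little-endian lane order that `vgetq_lane_u64` on lane 0 would see on a typical AArch64 target.

```rust
/// Scalar model of: compare each byte with `needle`, AND with the
/// repeating 0x0f, 0xf0 byte mask, then add adjacent byte pairs.
/// The result has one 4-bit entry per input byte.
fn nibble_mask(chunk: &[u8; 16], needle: u8) -> u64 {
    let mut packed = [0u8; 8];
    for (i, &byte) in chunk.iter().enumerate() {
        if byte == needle {
            // Even lanes keep their low nibble, odd lanes their high nibble,
            // so the two halves of a pair never overlap.
            packed[i / 2] |= if i % 2 == 0 { 0x0f } else { 0xf0 };
        }
    }
    // Byte 0 of the vector becomes the least significant byte.
    u64::from_le_bytes(packed)
}

/// Walks the mask four bits at a time, lowest nibble first.
fn match_indices(mut mask: u64) -> Vec<usize> {
    let mut out = Vec::new();
    let mut idx = 0;
    while mask != 0 {
        if mask & 0xf != 0 {
            out.push(idx);
        }
        idx += 1;
        mask >>= 4;
    }
    out
}

fn main() {
    let chunk = *b"ab\ncdefg\nhijklm\n";
    let mask = nibble_mask(&chunk, b'\n');
    println!("first match: {}", mask.trailing_zeros() / 4);
    println!("all matches: {:?}", match_indices(mask));
}
```

In this model the first match sits in the lowest set nibble, so a trailing-zero count divided by 4 gives its index, while a leading-zero count would point at the last match instead.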

How can I have the pixel coordinate X,Y after ArrayFire match_template?

I'm trying to use the match_template function from the ArrayFire library, but I don't know how to find the X and Y coordinates of the best matching value.
I was using the imageproc library to perform this function, and there it has the find_extremes function that returns the coordinates to me. How would I do the same using the ArrayFire library?
My example using imageproc
let template = image::open("connect.png").unwrap().to_luma8();
let screenshot = image::open("screenshot.png").unwrap().to_luma8();
let matching_probability= imageproc::template_matching::match_template(&screenshot, &template, MatchTemplateMethod::CrossCorrelationNormalized);
let positions = find_extremes(&matching_probability);
println!("{:?}", positions);
Extremes { max_value: 0.9998113, min_value: 0.42247093,
max_value_location: (843, 696), min_value_location: (657, 832) }
My example using ArrayFire
let template: Array<u8> = arrayfire::load_image(String::from("connect.png"), true);
let screenshot: Array<u8> = arrayfire::load_image(String::from("screenshot.png"), true);
let template_gray = rgb2gray(&template, 0.2126, 0.7152, 0.0722);
let screen_gray = rgb2gray(&screenshot, 0.2126, 0.7152, 0.0722);
let matching_probability = arrayfire::match_template(&screen_gray, &template_gray, arrayfire::MatchType::LSAD);
af_print!("{:?}", matching_probability);
139569.0469 140099.2500 139869.8594 140015.7969 140680.9844 141952.5781 142602.7344 142870.7188...
From here I don't have any idea how to get the best matching pixel coordinates.
ArrayFire doesn't provide an "extremum" function, but separate min and max families of functions.
The ones that provide index information are prefixed with i.
imin_all and imax_all return the min and max values, respectively, along with their indexes, wrapped in a tuple.
You can derive the pixel position from the value index and the array dimensions, knowing that ArrayFire is column-major.
let template: Array<u8> = arrayfire::load_image(String::from("connect.png"), true);
let screenshot: Array<u8> = arrayfire::load_image(String::from("screenshot.png"), true);
let template_gray = rgb2gray(&template, 0.2126, 0.7152, 0.0722);
let screen_gray = rgb2gray(&screenshot, 0.2126, 0.7152, 0.0722);
let matching_probability = arrayfire::match_template(&screen_gray, &template_gray, arrayfire::MatchType::LSAD);
let (min, _, min_idx) = imin_all(&matching_probability);
let (max, _, max_idx) = imax_all(&matching_probability);
let dims = matching_probability.dims();
let [_, height, _, _] = dims.get();
let px_x_min = min_idx as u64 / height;
let px_y_min = min_idx as u64 % height;
let px_x_max = max_idx as u64 / height;
let px_y_max = max_idx as u64 % height;
af_print!("{:?}", matching_probability);
println!("Minimum value: {} is at pixel ({},{}).",min, px_x_min, px_y_min);
println!("Maximum value: {} is at pixel ({},{}).", max, px_x_max, px_y_max);
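The index arithmetic used above can be checked in isolation. The sketch below is plain Rust with no ArrayFire calls; `unravel_column_major` is a hypothetical helper name, and `dim0` is assumed to be the extent of the fastest-varying axis of a column-major array. Which image axis that corresponds to in a given ArrayFire result is worth confirming against the array's `dims()`.

```rust
/// Splits a column-major linear index into (outer, inner) coordinates,
/// where `dim0` is the extent of the fastest-varying (inner) axis.
fn unravel_column_major(idx: u64, dim0: u64) -> (u64, u64) {
    (idx / dim0, idx % dim0)
}

/// Inverse of `unravel_column_major`, included as a round-trip check.
fn ravel_column_major(outer: u64, inner: u64, dim0: u64) -> u64 {
    outer * dim0 + inner
}

fn main() {
    // Hypothetical array whose fastest-varying axis has 4 elements.
    let dim0 = 4;
    let (outer, inner) = unravel_column_major(9, dim0);
    println!("index 9 -> outer {outer}, inner {inner}");
    println!("back to index {}", ravel_column_major(outer, inner, dim0));
}
```

For a similarity measure based on absolute differences, such as the LSAD match type used here, the best match is usually the minimum rather than the maximum, so the coordinates derived from the minimum index are likely the ones of interest.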

Is there an easy way to count booleans in Rust?

I've encountered a scenario in which I have a known, small number of boolean values, but I don't care about them individually, I just want to determine how many of them are true. It seems like there should be a fast way of doing this, but the best I can come up with is the naive solution:
let mut count: u8 = 0;
if a {
    count += 1;
}
if b {
    count += 1;
}
if c {
    count += 1;
}
Is there a better way of doing this?
You could simply do:
let count = a as u8 + b as u8 + c as u8;
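The cast works because `true as u8` is 1 and `false as u8` is 0. A minimal sketch of the same idea for a slice of flags follows; the helper name `count_true` is an assumption, and it sums into `usize` so that more than 255 flags cannot overflow the counter.

```rust
/// Counts the `true` values by casting each flag to an integer and summing.
fn count_true(flags: &[bool]) -> usize {
    flags.iter().map(|&b| b as usize).sum()
}

fn main() {
    let (a, b, c) = (true, false, true);
    // The one-liner from the answer, for a fixed handful of flags.
    let count = a as u8 + b as u8 + c as u8;
    println!("cast sum: {count}");
    println!("slice helper: {}", count_true(&[a, b, c]));
}
```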
I'd like to point you to Michael Anderson's post, which gives a much nicer syntax than mine.
let a = true;
let b = false;
let c = true;
let d = true;
let e = false;
let total_true = [a, b, c, d, e].into_iter().filter(|b| *b).count();
println!("{} options are true!", total_true);
I've left my original answer below for posterity
Original Answer
One option is to put your booleans into a Vector, then filter based on if they're true or not.
let options = vec![true, false, true, false, false, false, true];
let total_true = options.into_iter()
    .filter(|b| *b)
    .collect::<Vec<bool>>()
    .len();
println!("{} options are true!", total_true);
We need to convert Vec into an iter, so we can run filter on it.
Then, since filter checks for true boolean expressions, we simply deref our item as *b to get the non-borrowed boolean value.
Then we collect that back into a Vector, and get the length to find out how many met our criteria (of being true).
This gives you the total number of true booleans.

Is there an efficient function in Rust that finds the index of the first occurrence of a value in a sorted vector?

In [3, 2, 1, 1, 1, 0], if the value we are searching for is 1, then the function should return 2.
I found binary search, but it seems to return the last occurrence.
I do not want a function that iterates over the entire vector and matches one by one.
binary_search assumes that the elements are sorted in less-to-greater order. Yours is reversed, so you can use binary_search_by:
let x = 1; // value to look for
let data = [3, 2, 1, 1, 1, 0];
let idx = data.binary_search_by(|probe| probe.cmp(&x).reverse());
Now, as you say, you do not get the first one. That is expected, for the binary search algorithm will select an arbitrary value equal to the one searched. From the docs:
If there are multiple matches, then any one of the matches could be returned.
That is easily solvable with a loop:
let mut idx = data.binary_search_by(|probe| probe.cmp(&x).reverse());
if let Ok(ref mut i) = idx {
    // walk left while the previous element is still equal to x
    while *i > 0 {
        if data[*i - 1] != x {
            break;
        }
        *i -= 1;
    }
}
But if you expect many duplicates that may negate the advantages of the binary search.
If that is a problem for you, you can try to be smarter. For example, you can take advantage of this comment in the docs of binary_search:
If the value is not found then Result::Err is returned, containing the index where a matching element could be inserted while maintaining sorted order.
So to get the index of the first value with a 1, you look for an imaginary value just between 2 and 1 (remember that your array is reversed), something like 1.5. That can be done by tweaking the comparison function slightly:
let mut idx = data.binary_search_by(|probe| {
    // the 1s in the slice are treated as greater than the 1 in x
    probe.cmp(&x).reverse().then(std::cmp::Ordering::Greater)
});
There is a handy function Ordering::then() that does exactly what we need (the Rust stdlib is amazingly complete).
Or you can use a simpler direct comparison:
let mut idx = data.binary_search_by(|probe| {
    use std::cmp::Ordering::*;
    if *probe > x { Less } else { Greater }
});
The only detail left is that this function will always return Err(i), where i is either the position of the first 1 or the position where the 1 would be if there are none. An extra comparison is necessary to resolve this ambiguity:
if let Err(i) = idx {
    // beware! i may be 1 past the end of the slice
    if data.get(i) == Some(&x) {
        idx = Ok(i);
    }
}
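Putting the direct comparison and the final check together gives something like the sketch below. The function name `first_index_desc` and the `Option` return type are choices made for this example, not part of the answer above; the slice is assumed to be sorted in descending order.

```rust
use std::cmp::Ordering;

/// Index of the first element equal to `x` in a slice sorted in
/// descending order, or `None` if `x` does not occur.
fn first_index_desc(data: &[i32], x: i32) -> Option<usize> {
    // The comparator never returns Equal, so the search always ends in
    // Err(i), where i is the position of the first element that is <= x.
    let i = data
        .binary_search_by(|probe| {
            if *probe > x {
                Ordering::Less
            } else {
                Ordering::Greater
            }
        })
        .unwrap_err();
    // i may be one past the end, so use `get` rather than indexing.
    (data.get(i) == Some(&x)).then_some(i)
}

fn main() {
    let data = [3, 2, 1, 1, 1, 0];
    println!("{:?}", first_index_desc(&data, 1));
}
```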
Since 1.52.0, [T] has the method partition_point to find the partition point with a predicate in O(log N) time.
In your case, it should be:
let xs = vec![3, 2, 1, 1, 1, 0];
let idx = xs.partition_point(|&a| a > 1);
if idx < xs.len() && xs[idx] == 1 {
    println!("Found first 1 idx: {}", idx);
}
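The same `partition_point` approach can be wrapped in a small generic helper. This is a sketch under the assumption that the slice is sorted in descending order; the name `first_occurrence_desc` is made up for this example.

```rust
/// Index of the first element equal to `target` in a slice sorted in
/// descending order, found in O(log N) with `partition_point`.
fn first_occurrence_desc<T: Ord>(xs: &[T], target: &T) -> Option<usize> {
    // Everything strictly greater than `target` forms the left partition,
    // so `idx` is where the run of equal elements would begin.
    let idx = xs.partition_point(|a| a > target);
    // `get` covers the case where `idx` is one past the end.
    (xs.get(idx) == Some(target)).then_some(idx)
}

fn main() {
    let xs = vec![3, 2, 1, 1, 1, 0];
    match first_occurrence_desc(&xs, &1) {
        Some(idx) => println!("Found first 1 idx: {idx}"),
        None => println!("1 is not in the slice"),
    }
}
```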

Resources