Wrapping around negative numbers in Rust

I'm rewriting C code in Rust which heavily relies on u32 variables and wrapping them around. For example, I have a loop defined like this:
#define NWORDS 24
#define ZERO_WORDS 11
int main()
{
    unsigned int i, j;
    for (i = 0; i < NWORDS; i++) {
        for (j = 0; j < i; j++) {
            if (j < (i-ZERO_WORDS+1)) {
            }
        }
    }
    return 0;
}
Now, the if statement will need to wrap around u32 for a few values as initially i = 0. I came across the wrapping_neg method but it seems to just compute -self. Is there any more flexible way to work with u32 in Rust by also allowing wrapping?

As mentioned in the comments, the literal answer to your question is to use u32::wrapping_sub and u32::wrapping_add:
const NWORDS: u32 = 24;
const ZERO_WORDS: u32 = 11;
fn main() {
    for i in 0..NWORDS {
        for j in 0..i {
            if j < i.wrapping_sub(ZERO_WORDS).wrapping_add(1) {}
        }
    }
}
However, I'd advocate avoiding relying on wrapping operations unless you are performing hashing / cryptography / compression / something similar. Wrapping operations are non-intuitive. For example, j < i-ZERO_WORDS+1 doesn't have the same results as j+ZERO_WORDS < i+1.
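To make that concrete, here is a small sketch comparing the two inequalities. The function names `wrapped` and `rearranged` are made up for this illustration. With i = 1 and j = 0 the subtraction wraps to a huge value, so the first form is true while the rearranged form is false.

```rust
const ZERO_WORDS: u32 = 11;

/// The condition as the C code computes it, with the wrapping made explicit.
fn wrapped(i: u32, j: u32) -> bool {
    j < i.wrapping_sub(ZERO_WORDS).wrapping_add(1)
}

/// The algebraically rearranged form; nothing wraps for the small
/// loop values used in the question.
fn rearranged(i: u32, j: u32) -> bool {
    j + ZERO_WORDS < i + 1
}
```

Calling `wrapped(1, 0)` and `rearranged(1, 0)` gives different answers, while for values where nothing wraps, such as i = 20 and j = 5, the two agree.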
Even better would be to rewrite the logic. I can't even tell in which circumstances that if expression will be true without spending a lot of time thinking about it!
It turns out that the condition is true for i=9, j=8, but false for i=10, j=0. Perhaps all of this is clearer in the real code, but devoid of context it's very confusing.
This appears to have the same logic, but seems much more understandable to me:
i < ZERO_WORDS - 1 || i - j > ZERO_WORDS - 1;
Compare:
j < i.wrapping_sub(ZERO_WORDS).wrapping_add(1);
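If you want more than "appears to have the same logic", the two conditions can be compared over every (i, j) pair the loops actually visit. This is only a sketch; the helper names `original`, `rewritten`, and `count_mismatches` are invented for the check.

```rust
const NWORDS: u32 = 24;
const ZERO_WORDS: u32 = 11;

/// The wrapping condition from the direct translation.
fn original(i: u32, j: u32) -> bool {
    j < i.wrapping_sub(ZERO_WORDS).wrapping_add(1)
}

/// The rewritten condition without wrapping arithmetic.
/// `i - j` cannot underflow because the loops guarantee j < i.
fn rewritten(i: u32, j: u32) -> bool {
    i < ZERO_WORDS - 1 || i - j > ZERO_WORDS - 1
}

/// Counts the (i, j) pairs in the loop ranges where the two conditions disagree.
fn count_mismatches() -> usize {
    let mut mismatches = 0;
    for i in 0..NWORDS {
        for j in 0..i {
            if original(i, j) != rewritten(i, j) {
                mismatches += 1;
            }
        }
    }
    mismatches
}
```

The check only covers the ranges from the question (j < i < NWORDS); outside those ranges the rewritten form is not guaranteed to match, and `i - j` could underflow.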

Related

Fast Walsh-Hadamard Transform in Rust

I'm having a hard time getting Rust code to compute the Walsh-Hadamard transform efficiently in place. The algorithm is inherently highly parallelizable, but it requires two nested for loops.
This is easy in C, see below the code using OpenMP which is fast.
void FWHT32(int32_t* array, int n){
    uint32_t d, i, j;
    for(d=n; d > 0; d--){
        uint32_t D = ((uint32_t)1<<d);
        #pragma omp parallel for collapse(2)
        for(j=0; j < ((uint32_t) 1 << (n-d)); j++){
            for(i=0; i < (D >> 1); i++){
                array[D*j+i] = array[D*j+i] + array[D*j+(D>>1)+i];
                array[D*j+(D>>1)+i] = array[D*j+i] - 2*array[D*j+(D>>1)+i];
            }
        }
    }
}
But in Rust I have a really hard time iterating in parallel over two distant values; my best attempt so far doesn't parallelize the inner loop:
use std::ops::{AddAssign, Sub};
use rayon::prelude::*;
pub fn _par_fast_walsh_hadamard_tr<T: Send + Copy + Sized + AddAssign + Sub<Output = T>>(array: &mut [T]) {
    let arlen = array.len();
    let n = arlen.trailing_zeros() as usize;
    if arlen != (1 << n) {
        panic!("Length must be a power of 2.");
    }
    for step in 0..n {
        let s = 1 << (n - step);
        array.par_chunks_exact_mut(s)
            .for_each(|chunk| {
                for j in 0..s/2 {
                    let a = chunk[j];
                    chunk[j] += chunk[j+s/2];
                    chunk[j+s/2] = a - chunk[j+s/2];
                }
            });
    }
}
But even this runs only slightly faster on 6 cores / 12 threads compared to a trivial sequential Rust implementation.
Is Rayon not the right parallel framework for this?
How can one parallelize the inner loop?
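One possible way around the "two distant values" difficulty is to split each chunk into its two halves with `split_at_mut`, which gives two disjoint mutable slices that can be zipped together. The sketch below is sequential and uses only the standard library; the function name `fast_walsh_hadamard` and the `Add` bound are choices made for this illustration, and it says nothing about the performance question. Assuming rayon's indexed parallel iterators, the zipped inner loop could in principle be written with `par_iter_mut()` and `zip`, but that variant is untested here.

```rust
use std::ops::{Add, Sub};

/// In-place fast Walsh-Hadamard transform (sequential sketch).
/// Panics if the length is not a power of two.
pub fn fast_walsh_hadamard<T>(array: &mut [T])
where
    T: Copy + Add<Output = T> + Sub<Output = T>,
{
    let len = array.len();
    assert!(len.is_power_of_two(), "length must be a power of two");
    // Same stage order as the code in the question: largest stride first.
    let mut s = len;
    while s >= 2 {
        for chunk in array.chunks_exact_mut(s) {
            // Two disjoint mutable halves, so both can be updated in one pass.
            let (lo, hi) = chunk.split_at_mut(s / 2);
            for (a, b) in lo.iter_mut().zip(hi.iter_mut()) {
                let (x, y) = (*a, *b);
                *a = x + y;
                *b = x - y;
            }
        }
        s /= 2;
    }
}
```

For example, transforming `[1, 1, 1, 1]` leaves all the weight in the first element, `[4, 0, 0, 0]`.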

Longest Common Substring non-DP solution with O(m*n)

The definition of the problem is:
Given two strings, find the longest common substring.
Return the length of it.
I was solving this problem and I think I solved it with O(m*n) time complexity. However I don't know why when I look up the solution, it's all talking about the optimal solution being dynamic programming - http://www.geeksforgeeks.org/longest-common-substring/
Here's my solution, you can test it here: http://www.lintcode.com/en/problem/longest-common-substring/
int longestCommonSubstring(string &A, string &B) {
    int ans = 0;
    for (int i=0; i<A.length(); i++) {
        int counter = 0;
        int k = i;
        for (int j=0; j<B.length() && k <A.length(); j++) {
            if (A[k]!=B[j]) {
                counter = 0;
                k = i;
            } else {
                k++;
                counter++;
                ans = max(ans, counter);
            }
        }
    }
    return ans;
}
My idea is simple, start from the first position of string A and see what's the longest substring I can match with string B, then start from the second position of string A and see what's the longest substring I can match....
Is there something wrong with my solution? Or is it not O(m*n) complexity?
Good news: your algorithm is O(mn). Bad news: it doesn't work correctly.
Your inner loop is wrong: it's intended to find the longest initial substring of A[i:] in B, but it works like this:
j = 0
While j < len(B)
    Match as much of A[i:] against B[j:]. Call it s.
    Remember s if it's the longest so far found.
    j += len(s)
This fails to find the longest match. For example, when A = "XXY" and B = "XXXY" and i=0 it'll find "XX" as the longest match instead of the complete match "XXY".
Here's a runnable version of your code (lightly transcribed into C) that shows the faulty result:
#include <string.h>
#include <stdio.h>
int lcs(const char* A, const char* B) {
    int al = strlen(A);
    int bl = strlen(B);
    int ans = 0;
    for (int i=0; i<al; i++) {
        int counter = 0;
        int k = i;
        for (int j=0; j<bl && k<al; j++) {
            if (A[k]!=B[j]) {
                counter = 0;
                k = i;
            } else {
                k++;
                counter++;
                if (counter >= ans) ans = counter;
            }
        }
    }
    return ans;
}
int main(int argc, char**argv) {
    printf("%d\n", lcs("XXY", "XXXY"));
    return 0;
}
Running this program outputs "2".
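For comparison, here is a minimal sketch of the usual dynamic-programming formulation, written in Rust. I'm assuming it matches what the linked article describes: each table entry holds the length of the longest common suffix of the two prefixes ending at that pair of positions. The function name is illustrative, and the comparison is byte-wise rather than by Unicode character.

```rust
/// Length of the longest common substring of `a` and `b`, compared byte-wise.
/// Runs in O(m*n) time and keeps only two rows of the table.
fn longest_common_substring(a: &str, b: &str) -> usize {
    let (a, b) = (a.as_bytes(), b.as_bytes());
    // prev[j + 1] is the longest common suffix length for the previous
    // byte of `a` and the prefix of `b` ending at index j.
    let mut prev = vec![0usize; b.len() + 1];
    let mut best = 0;
    for &ca in a {
        let mut curr = vec![0usize; b.len() + 1];
        for (j, &cb) in b.iter().enumerate() {
            if ca == cb {
                // Extend the match that ended one position earlier in both strings.
                curr[j + 1] = prev[j] + 1;
                best = best.max(curr[j + 1]);
            }
        }
        prev = curr;
    }
    best
}
```

On the failing input above, A = "XXY" and B = "XXXY", this returns 3, because every starting offset in B is considered rather than only the ones reached after a reset.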
Your solution has O(nm) complexity, and if you compare its structure to the provided algorithm, it's exactly the same; however, yours does not memoize.
One advantage of the dynamic algorithm in the link is that, within the same complexity class, it can recall different substring lengths in O(1); otherwise, it looks good to me.
This kind of thing will happen from time to time, because storing subproblem solutions will not always result in a better run time (on the first call) and can instead result in the same complexity class (e.g. try to compute the nth Fibonacci number with a dynamic solution and compare that to a tail-recursive solution). Note that in this case, as in yours, after the array is filled the first time, it's faster to return an answer on each successive call.

interview riddle (string manipulation) - explanation needed

I am studying for an interview and encountered a question and its solution.
I am having a problem with one line in the solution and was hoping someone here could explain it.
the question:
Write a method to replace all spaces in a string with ‘%20’.
the solution:
public static void ReplaceFun(char[] str, int length) {
    int spaceCount = 0, newLength, i = 0;
    for (i = 0; i < length; i++) {
        if (str[i] == ' ') {
            spaceCount++;
        }
    }
    newLength = length + spaceCount * 2;
    str[newLength] = '\0';
    for (i = length - 1; i >= 0; i--) {
        if (str[i] == ' ') {
            str[newLength - 1] = '0';
            str[newLength - 2] = '2';
            str[newLength - 3] = '%';
            newLength = newLength - 3;
        } else {
            str[newLength - 1] = str[i];
            newLength = newLength - 1;
        }
    }
}
My problem is with line number 9. How can he just set str[newLength] to '\0'? Or in other words, how can he take over the needed amount of memory without allocating it first or something like that?
Isn't he overrunning memory?!
Assuming this is actually meant to be in C (public static is not valid C or C++), they can't, as it's written. They never allocate a new str that is long enough to hold the old string plus the %20 expansion.
I suspect there's an additional part to the question, which is that str is already long enough to hold the expanded %20 data, and that length is the length of the string in str, not counting the zero terminator.
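To illustrate that assumption, here is a sketch in Rust of the same back-to-front algorithm with the capacity requirement made explicit. The function name `replace_spaces` and the choice to return `None` when the buffer is too short are mine, not part of the original solution; `length` is the number of meaningful bytes at the start of `buf`, and no terminator is written because Rust slices carry their own length.

```rust
/// Replaces each space in the first `length` bytes of `buf` with "%20", in place.
/// Returns the new length, or None if `buf` has no room for the expansion.
fn replace_spaces(buf: &mut [u8], length: usize) -> Option<usize> {
    if length > buf.len() {
        return None;
    }
    let space_count = buf[..length].iter().filter(|&&b| b == b' ').count();
    let new_length = length + 2 * space_count;
    if new_length > buf.len() {
        // The caller did not provide enough spare capacity.
        return None;
    }
    // Fill from the back so unread input is never overwritten.
    let mut write = new_length;
    for read in (0..length).rev() {
        if buf[read] == b' ' {
            buf[write - 3..write].copy_from_slice(b"%20");
            write -= 3;
        } else {
            buf[write - 1] = buf[read];
            write -= 1;
        }
    }
    Some(new_length)
}
```

With a 9-byte buffer holding "a b c" followed by four spare bytes and `length` set to 5, the buffer ends up as "a%20b%20c". With no spare bytes the function refuses to write instead of running past the end.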
This is valid code, but it's not good code. You are completely correct in your assessment that we are overwriting the bounds of the initial str[]. This could cause some rather unwanted side-effects depending on what was being overwritten.

Generate 50 random numbers and store them into an array c++

This is what I have of the function so far. This is only the beginning of the problem: it asks to generate the random numbers and output them in a 10 by 5 group, and then to sort them by size, but I am just trying to get this first part down.
/* Populate the array with 50 randomly generated integer values
* in the range 1-50. */
void populateArray(int ar[], const int n) {
    int n;
    for (int i = 1; i <= length - 1; i++){
        for (int i = 1; i <= ARRAY_SIZE; i++) {
            i = rand() % 10 + 1;
            ar[n]++;
        }
    }
}
First of all, we want to use std::array. It has some nice properties, one of which is that it doesn't decay to a pointer. Another is that it knows its size. In this case we are going to use templates to make populateArray a sufficiently generic algorithm.
template<std::size_t N>
void populateArray(std::array<int, N>& array) { ... }
Then, we would like to remove all "raw" for loops. std::generate_n in combination with some random generator seems a good option.
For the number generator we can use <random>. Specifically std::uniform_int_distribution. For that we need to get some generator up and running:
std::random_device device;
std::mt19937 generator(device());
std::uniform_int_distribution<> dist(1, N);
and use it in our std::generate_n algorithm:
std::generate_n(array.begin(), N, [&dist, &generator](){
    return dist(generator);
});

Convert For loop into Parallel.For loop

public void DoSomething(byte[] array, byte[] array2, int start, int counter)
{
int length = array.Length;
int index = 0;
while (count >= needleLen)
{
index = Array.IndexOf(array, array2[0], start, count - length + 1);
int i = 0;
int p = 0;
for (i = 0, p = index; i < length; i++, p++)
{
if (array[p] != array2[i])
{
break;
}
}
Given that your for loop appears to be using a loop body dependent on ordering, it's most likely not a candidate for parallelization.
However, you aren't showing the "work" involved here, so it's difficult to tell what it's doing. Since the loop relies on both i and p, and it appears that they would vary independently, it's unlikely to be rewritten using a simple Parallel.For without reworking or rethinking your algorithm.
In order for a loop body to be a good candidate for parallelization, it typically needs to be order independent, and have no ordering constraints. The fact that you're basing your loop on two independent variables suggests that these requirements are not met in this algorithm.
