Can Modulo be Distributed? - string

It can in math, but I came across this when I was looking into the Rabin-Karp string searching algorithm. The hash function they used (source: Wikipedia: https://en.wikipedia.org/wiki/Rabin%E2%80%93Karp_algorithm#Hash_function_used) was this:
[(104 × 256 ) % 101 + 105] % 101 = 65
How is this better than deleting the inner mod operator so that you only have one on the outside? Like so:
[104 × 256 + 105] % 101
As far as I can tell it should give the same result, and mods are generally expensive operations, so wouldn't it be better to have one?
The only thing I can think of is concerns about overflow, but if that were the case, the multiplication would be similarly split up, like so:
[((104 % 101) × (256 % 101)) % 101 + 105] % 101 = 65

When you implement a formula, you generally try to keep the code in the same shape as the formula. Suppose the formula looks like this:
[(x × y ) % z + t] % w
In our case z and w have the same value, but they could be different. If you simplify the formula to match your case and z and w later diverge, it will be hard to work out what the code was meant to do. If z and w are guaranteed to stay equal, you might consider the simplification. Even then you need to be careful: if x and y are fairly large, the product may already be close to the integer limit, and adding t to it without reducing first can overflow. The same applies if t is very large.
As for your question,
[a % b + c] % b
is equivalent to
[a + c] % b
mathematically. In actual code, though, there may be nuances (such as overflow in fixed-width integers) that justify the seemingly superfluous operation.
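As a quick check of the equivalence, here is a minimal Python sketch. The function name poly_hash and its default arguments are illustrative and not taken from the Wikipedia article. It reduces after every step, which is what keeps intermediate values small in languages with fixed-width integers; Python's own integers never overflow, so here the two forms agree trivially.

```python
def poly_hash(text, base=256, modulus=101):
    """Polynomial hash that reduces modulo `modulus` after every step."""
    h = 0
    for ch in text:
        # h is below modulus here, so the product stays below modulus * base.
        h = (h * base + ord(ch)) % modulus
    return h


# The two forms from the question: 104 and 105 are the codes of "h" and "i".
inner_and_outer = ((104 * 256) % 101 + 105) % 101
outer_only = (104 * 256 + 105) % 101
print(inner_and_outer, outer_only, poly_hash("hi"))  # 65 65 65
```

In C or Java the same loop with only one final reduction would overflow after a handful of characters, which is the practical reason for the inner mod.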

Related

Divide a value in excel by a set of preset values to find out how many of each are needed

I am curious if there is a way to make my life easier. In Excel I am producing a total value, say 750, and need to find out how many orders of pipe I need from values of 50, 100, 200, 250 and 500. Is there any way to have Excel take a value and then return how many of each of these numbers I would need, so for the 750 case, one 500 and one 250?
Currently the solution is just worked out in my head
Assuming you want to try to fit pipes in decreasing order of size, and that you have access to the required functions, you can use REDUCE as demonstrated here to step through the sizes and successively divide by each one, although the formula is a little laboured:
=LET(pipes,{500;250;200;100;50},reqd,750,DROP(REDUCE(0,pipes,
LAMBDA(a,c,VSTACK(a,QUOTIENT(reqd-IF(ROWS(a)>1,SUM(DROP(a,1)*TAKE(pipes,ROWS(a)-1)),0),c)))),1))
As pointed out by @Jos Woolley, this may not give you the answer you want if the total is something like 749. It will fit in as many values as possible and give a result of 500+200, total 700 (remainder 49). You could perhaps fix it by rounding up to the next multiple of 50.
For the example of 823, you would have:
=LET(pipes,{500;250;200;100;50},reqd,CEILING(823,MIN(pipes)),DROP(REDUCE(0,pipes,
LAMBDA(a,c,VSTACK(a,QUOTIENT(reqd-IF(ROWS(a)>1,SUM(DROP(a,1)*TAKE(pipes,ROWS(a)-1)),0),c)))),1))
which gives 500+250+100=850.
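For readers who find the formula hard to follow, the same greedy idea can be sketched in Python. The function name greedy_pipes is made up for illustration, and the rounding step assumes every size is a multiple of the smallest one, as is the case for the sizes in the question.

```python
def greedy_pipes(required, sizes):
    """Fit the largest sizes first, after rounding the total up."""
    smallest = min(sizes)
    # Integer ceiling: round up to the next multiple of the smallest size.
    remaining = -(-required // smallest) * smallest
    counts = {}
    for size in sorted(sizes, reverse=True):
        # How many of this size fit, and what is left over afterwards.
        counts[size], remaining = divmod(remaining, size)
    return counts


print(greedy_pipes(823, [500, 250, 200, 100, 50]))
# {500: 1, 250: 1, 200: 0, 100: 1, 50: 0}
```

Like the formula, this is greedy, so it does not always use the fewest pipes.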
Well I've got a bit obsessed with this now and I am determined to get a lambda working to find the optimal answer! I have been looking at the brute-force solution to finding the minimum number of coins required to make up a given total in the reference mentioned previously and have managed to translate it into a lambda using Reduce:
Mincoins1= LAMBDA(coins, m, v,
IF(
v <= 0,
0,
REDUCE(
999,
coins,
LAMBDA(a, c,
IF(v >= c, LET(mc, mincoins1.mincoins1(coins, m, v - c) + 1, IF(mc < a, mc, a)), a)
)
)
)
)
This does give the correct answer, 2, for the case when you want to make up a value of 400 from the list of pipes given. The next step will be to modify the code to return the list of pipes which give that total (200,200).
https://www.enjoyalgorithms.com/blog/minimum-coin-change
Here is the lambda modified to return a string containing the chosen pipes:
Mincoins2= LAMBDA(coins, m, v,
IF(
v <= 0,
"",
REDUCE(
rept("x",999),
coins,
LAMBDA(a, c,
IF(v >= c, LET(mc, c&"|"&mincoins2.mincoins2(coins, m, v - c), IF(len(mc) < len(a), mc, a)), a)
)
)
)
);
It does work BUT (and this is a big but) it hits a limit as soon as the value to be produced exceeds 1000 and you get a #VALUE! error. Disappointing. But interesting, I think, as a proof of concept.
Not sure I understand the question, but let's try.
If you have 1 450 to divide, use a formula that divides 1 450 by your highest length (750) and then rounds the result down.
So the formula would be something along the lines of: = rounddown(1 450 / 750; 0)
you will then get the answer that you need 1 of the length 750.
then keep the info about how much length you have remaining. So a formula like:
=1 450 - 750 * [the answer from the previous formula = 1]. This gives 700.
Then start over with the same thing, but divide 700 by 500 (the second largest size).
Your question is extremely difficult. One might think of this easy solution, starting with value_begin:
amount_of_500 = value_begin DIV 500; // integer division
temp = value_begin - 500 * amount_of_500;
amount_of_250 = temp DIV 250; // again integer division
temp = temp - 250 * amount_of_250;
amount_of_200 = temp DIV 200; // again integer division
temp = temp - 200 * amount_of_200;
...
However, this will not work because of the value 200, which is far too close to 250: just start with value_begin equal to 400 (algorithm solution : 250 + 100 + 50, while best solution : 200 + 200).
Are you sure you need both 200 and 250 as possible numbers to divide by? If yes, you might have a serious problem getting this implemented.
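If both 200 and 250 really are needed, one way around the greedy failure is the standard dynamic-programming approach to the change-making problem. The following is only a sketch under that assumption; the function name fewest_pipes is hypothetical and nothing here is Excel-specific.

```python
def fewest_pipes(total, sizes):
    """Return a shortest list of sizes summing to total, or None if impossible."""
    INF = float("inf")
    best = [0] + [INF] * total   # best[v]: fewest pipes that sum to v
    last = [0] * (total + 1)     # last[v]: one size used in a best solution for v
    for v in range(1, total + 1):
        for size in sizes:
            if size <= v and best[v - size] + 1 < best[v]:
                best[v] = best[v - size] + 1
                last[v] = size
    if best[total] == INF:
        return None
    # Walk back through the recorded choices to recover the pipes.
    chosen = []
    v = total
    while v > 0:
        chosen.append(last[v])
        v -= last[v]
    return sorted(chosen, reverse=True)


print(fewest_pipes(400, [500, 250, 200, 100, 50]))  # [200, 200]
print(fewest_pipes(750, [500, 250, 200, 100, 50]))  # [500, 250]
```

The table has one entry per unit of length, so for large totals it may be worth dividing everything by 50 first.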

Calculating a custom probability distribution in python (numerically)

I have a custom (discrete) probability distribution defined somewhat in the form: f(x)/(sum(f(x')) for x' in a given discrete set X). Also, 0<=x<=1.
So I have been trying to implement it in python 3.8.2, and the problem is that the numerator and denominator both come out to be really small and python's floating point representation just takes them as 0.0.
After calculating these probabilities, I need to sample a random element from an array, whose each index may be selected with the corresponding probability in the distribution. So if my distribution is [p1,p2,p3,p4], and my array is [a1,a2,a3,a4], then probability of selecting a2 is p2 and so on.
So how can I implement this in an elegant and efficient way?
Is there any way I could use the np.random.beta() in this case? Since the difference between the beta distribution and my actual distribution is only that the normalization constant differs and the domain is restricted to a few points.
Note: The Probability Mass function defined above is actually in the form given by the Bayes theorem and f(x)=x^s*(1-x)^f, where s and f are fixed numbers for a given iteration. So the exact problem is that, when s or f become really large, this thing goes to 0.
You could well compute things by working with logs. The point is that while both the numerator and denominator might underflow to 0, their logs won't unless your numbers are really astonishingly small.
You say
f(x) = x^s*(1-x)^t
so
logf (x) = s*log(x) + t*log(1-x)
and you want to compute, say
p = f(x) / Sum{ y in X | f(y)}
so
p = exp( logf(x) - log( sum{ y in X | f(y)}))
  = exp( logf(x) - log( sum{ y in X | exp( logf(y))}))
The only difficulty is in computing the second term, but this is a common problem, for example here
On the other hand, computing logsumexp is easy enough to do by hand.
We want
S = log( sum{ i | exp(l[i])})
if L is the maximum of the l[i] then
S = log( exp(L)*sum{ i | exp(l[i]-L)})
= L + log( sum{ i | exp( l[i]-L)})
The last sum can be computed as written, because each term is now between 0 and 1 so there is no danger of overflow, and one of the terms (the one for which l[i]==L) is 1, and so if other terms underflow, that is harmless.
This may however lose a little accuracy. A refinement would be to recognize the set A of indices where
l[i]>=L-eps (eps a user set parameter, eg 1)
And then compute
N = Sum{ i in A | exp(l[i]-L)}
B = log1p( Sum{ i not in A | exp(l[i]-L)}/N)
S = L + log( N) + B
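Here is a minimal sketch of the basic (unrefined) version in Python, using only the standard library. The function names are illustrative, and it assumes every x lies strictly between 0 and 1 so that both logarithms are defined.

```python
import math
import random


def log_f(x, s, t):
    """log of x^s * (1-x)^t, assuming 0 < x < 1."""
    return s * math.log(x) + t * math.log1p(-x)


def logsumexp(logs):
    """log(sum(exp(l) for l in logs)), shifted by the maximum for stability."""
    top = max(logs)
    return top + math.log(sum(math.exp(l - top) for l in logs))


def probabilities(xs, s, t):
    """Normalised probabilities f(x) / sum(f(y)), computed in log space."""
    logs = [log_f(x, s, t) for x in xs]
    total = logsumexp(logs)
    return [math.exp(l - total) for l in logs]


xs = [0.1, 0.3, 0.5, 0.6, 0.7, 0.9]
probs = probabilities(xs, s=5000, t=3000)
# Draw one element of `items` with the matching probability.
items = ["a1", "a2", "a3", "a4", "a5", "a6"]
sample = random.choices(items, weights=probs, k=1)[0]
```

With s=5000 and t=3000 the direct product x**s * (1-x)**t underflows to 0.0 for every x in the list, while the log-space version still returns probabilities that sum to 1.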

Coin Change Optimization

I'm trying to solve this problem:
Suppose I have a set of n coins {a_1, a_2, ..., a_n}. A coin with value
1 will always appear. What is the minimum number of coins I
need to reach M?
The constraints are:
1 ≤ n ≤ 25
1 ≤ M ≤ 10^6
1 ≤ a_i ≤ 100
Ok, I know that it's the Change-making problem.
I have tried to solve this problem using Breadth-First Search, Dynamic Programming and Greedy (which is incorrect, since it doesn't always give the best solution). However, I get Time Limit Exceeded (3 seconds).
So I wonder if there's an optimization for this problem.
The description and the constraints caught my attention, but I don't know how to use them in my favour:
A coin with value 1 will always appear.
1 ≤ a_i ≤ 100
I saw on Wikipedia that this problem can also be solved by "Dynamic programming with the probabilistic convolution tree". But I could not understand anything.
Can you help me?
This problem can be found here: http://goo.gl/nzQJem
Let a_n be the largest coin. Use these two clues:
result is >= ceil(M/a_n),
the result configuration has a lot of a_n's.
It is best to start with the maximum number of a_n's and then check whether fewer a_n's give a better result, for as long as a better result is still possible.
Something like: let R({a_1, ..., a_n}, M) be a function that returns the result for a given problem. Then R can be implemented as:
num_a_n = floor(M / a_n)
best_r = num_a_n + R({a_1, ..., a_(n-1)}, M - a_n*num_a_n)
while num_a_n > 0:
    num_a_n = num_a_n - 1
    # Check whether a better result is possible at all
    if num_a_n + ceil((M - a_n*num_a_n) / a_(n-1)) >= best_r:
        return best_r
    next_r = num_a_n + R({a_1, ..., a_(n-1)}, M - a_n*num_a_n)
    if next_r < best_r:
        best_r = next_r
return best_r
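A runnable Python sketch of this recursion follows. The names min_coins and solve are made up here, and the base cases (nothing left to pay, or only the coin of value 1 left) are an assumption added to make the pseudocode terminate; they rely on the guarantee that a coin with value 1 always appears. No claim is made that this meets the 3 second limit.

```python
def min_coins(coins, target):
    """Fewest coins summing to target; assumes a coin of value 1 is present."""
    coins = sorted(set(coins))
    if coins[0] != 1:
        raise ValueError("a coin with value 1 is required")

    def solve(n, m):
        # Fewest coins to make m using only the n smallest coins.
        if m == 0:
            return 0
        if n == 1:
            return m  # only the coin of value 1 is left
        largest, next_largest = coins[n - 1], coins[n - 2]
        count = m // largest
        best = count + solve(n - 1, m - largest * count)
        while count > 0:
            count -= 1
            rest = m - largest * count
            # Lower bound: the rest needs at least ceil(rest / next_largest) coins.
            if count + -(-rest // next_largest) >= best:
                return best
            best = min(best, count + solve(n - 1, rest))
        return best

    return solve(len(coins), target)


print(min_coins([1, 50, 100, 200, 250, 500], 400))  # 2
print(min_coins([1, 3, 4], 6))                      # 2
```

The early return is safe because the lower bound can only grow as fewer of the largest coin are used.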

Reverse Interpolation

I have a class implementing an audio stream that can be read at varying speed (including reverse and fast varying / "scratching")... I use linear interpolation for the read part and everything works quite decently..
But now I want to implement writing to the stream at varying speed as well and that requires me to implement a kind of "reverse interpolation" i.e. Deduce the input sample vector Z that, interpolated with vector Y will produce the output X (which I'm trying to write)..
I've managed to do it for constant speeds, but generalising for varying speeds (e.g accelerating or decelerating) is proving more complicated..
I imagine this problem has been solved repeatedly, but I can't seem to find many clues online, so my specific question is if anyone has heard of this problem and can point me in the right direction (or, even better, show me a solution :)
Thanks!
I would not call it "reverse interpolation" as that does not exist (my first thought was you were talking about extrapolation!). What you are doing is still simply interpolation, just at an uneven rate.
Interpolation: finding a value between known values
Extrapolation: finding a value beyond known values
Interpolating to/from constant rates is indeed much much simpler than the generic quest of "finding a value between known values". I propose 2 solutions.
1) Interpolate to a significantly higher rate, and then just sub-sample to the nearest one (try adding dithering)
2) Solve the generic problem: for each point you need to use the neighboring N points and fit an order N-1 polynomial to them.
N=2 would be linear and would add overtones (C0 continuity)
N=3 could leave you with step changes at the halfway point between your source samples (perhaps worse overtones than N=2!)
N=4 will get you C1 continuity (slope will match as you change to the next sample), surely enough for your application.
Let me explain that last one.
For each output sample use the 2 previous and 2 following input samples. Call them S0 to S3 on a unit time scale (multiply by your sample period later), and you are interpolating from time 0 to 1. Y is your output and Y' is the slope.
Y will be calculated from this polynomial and its differential (slope)
Y(t) = At^3 + Bt^2 + Ct + D
Y'(t) = 3At^2 + 2Bt + C
The constraints (the values and slope at the endpoints on either side)
Y(0) = S1
Y'(0) = (S2-S0)/2
Y(1) = S2
Y'(1) = (S3-S1)/2
Expanding the polynomial
Y(0) = D
Y'(0) = C
Y(1) = A+B+C+D
Y'(1) = 3A+2B+C
Plugging in the Samples
D = S1
C = (S2-S0)/2
A + B = S2 - C - D
3A+2B = (S3-S1)/2 - C
The last 2 are a system of equations that are easily solvable. Subtract 2x the first from the second.
3A+2B - 2(A+B)= (S3-S1)/2 - C - 2(S2 - C - D)
A = (S3-S1)/2 + C - 2(S2 - D)
Then B is
B = S2 - A - C - D
Once you have A, B, C and D you can put any time 't' between 0 and 1 into the polynomial to find a sample value between your known samples.
Repeat for every output sample, reuse A,B,C&D if the next output sample is still between the same 2 input samples. Calculating t each time is similar to Bresenham's line algorithm, you're just advancing by a different amount each time.
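The derivation above translates almost line for line into code. This is a minimal Python sketch with made-up function names; it evaluates a single point and leaves out the stepping of t and the reuse of coefficients between output samples.

```python
def cubic_coefficients(s0, s1, s2, s3):
    """Coefficients A, B, C, D of the cubic between samples s1 and s2."""
    d = s1
    c = (s2 - s0) / 2
    a = (s3 - s1) / 2 + c - 2 * (s2 - d)
    b = s2 - a - c - d
    return a, b, c, d


def interpolate(s0, s1, s2, s3, t):
    """Value at time t (0 <= t <= 1) between s1 and s2."""
    a, b, c, d = cubic_coefficients(s0, s1, s2, s3)
    # Horner form of A*t^3 + B*t^2 + C*t + D.
    return ((a * t + b) * t + c) * t + d


print(interpolate(0, 1, 2, 3, 0.5))  # 1.5
```

On samples that lie on a straight line the cubic terms vanish and the result matches linear interpolation, which is a handy sanity check.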

Does ((a^x) ^ 1/x) == a in Zp? (for Jablon's protocol)

I have to implement Jablon's protocol (paper) but I've been sitting on a bug for two hours.
I'm not very good with math so I don't know if it's my fault in writing it or it just isn't possible. If it isn't possible, I don't see how Jablon's protocol can be implemented since it relies on the fact that ((gP ^ x) ^ yi) ^ (1/x) == gP^yi .
Take the following code. It doesn't work.
BigInteger p = new BigInteger("101");
BigInteger a = new BigInteger("83");
BigInteger x = new BigInteger("13");
BigInteger ax = a.modPow(x, p);
BigInteger xinv = x.modInverse(p);
BigInteger axxinv = ax.modPow(xinv, p);
if (a.equals(axxinv))
System.out.println("Yay!");
else
System.out.println("How is this possible?");
Your problem is that you're not calculating k^(1/x) correctly. We need (k^(1/x))^x to be k. Fermat's Little Theorem tells us that k^(p-1) is 1 mod p. Therefore we want to find y such that x * y is 1 mod p-1, not mod p.
So you want BigInteger xinv = x.modInverse(p.subtract(BigInteger.ONE));.
This will not work if x shares a common factor with p-1. (Your case avoids that.) For that, you need additional theory.
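The fix can be checked quickly with the numbers from the question. This is a small Python sketch rather than Java; it assumes Python 3.8 or later, where the built-in pow accepts an exponent of -1 together with a modulus to compute a modular inverse.

```python
p, a, x = 101, 83, 13

ax = pow(a, x, p)

# Inverse of x modulo p - 1 (the order of the multiplicative group), not modulo p.
xinv = pow(x, -1, p - 1)
print(xinv)                   # 77
print(pow(ax, xinv, p) == a)  # True

# The original code inverted modulo p instead, which does not undo the power.
wrong_inv = pow(x, -1, p)
print(pow(ax, wrong_inv, p) == a)  # False
```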
If p is a prime, then r is a primitive root if none of r, r^2, r^3, ..., r^(p-2) are congruent to 1 mod p. There is no simple algorithm to produce a primitive root, but they are common so you usually only need to check a few. (For p=101, the first number I tried, 2, turned out to be a primitive root. 83 is also.) Testing them would seem to be hard, but it isn't so bad since it turns out that (omitting a bunch of theory here) only divisors of p-1 need to be checked. For instance for 101 you only need to check the powers 1, 2, 4, 5, 10, 20, 25 and 50.
Now if r is a primitive root, then every nonzero number mod p is some power of r. What power? That's called the discrete logarithm problem and is not simple. (Its difficulty is the basis of Diffie-Hellman key exchange and related cryptosystems.) For a small p you can do it by brute force: trying the exponents 1, 2, 3, ... you eventually find that, for instance, 83 is 2^89 (mod 101).
But once we know that every number from 1 to 100 is 2 to some power, we are armed with a way to calculate roots. Because raising a number to the power of x just multiplies the exponent by x. And 2^100 is 1. So exponentiation is multiplying by x (mod 100).
So suppose that we want y ^ 13 to be 83. Then y is 2^k for some k such that k * 13 is 89 (mod 100). If you play around with the Chinese Remainder Theorem you can work out that k = 53 works. Therefore 2^53 (mod 101) = 93 is the 13th root of 83.
That is harder than what we did before. But suppose that we wanted to take, say, the 5th root of 44 mod 101. We can't use the simple procedure because 5 does not have a multiplicative inverse mod 100. However 44 is 2^15. Therefore 2^3 = 8 is a 5th root. But there are 4 others, namely 2^23, 2^43, 2^63 and 2^83.
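The numbers in this walkthrough can be reproduced with a few lines of Python. The helper names below are made up for illustration, and everything is brute force, which is fine for p = 101 but hopeless for cryptographic sizes.

```python
def is_primitive_root(r, p):
    """True if no power r^d with d a proper divisor of p-1 is 1 mod p."""
    return all(pow(r, d, p) != 1 for d in range(1, p - 1) if (p - 1) % d == 0)


def discrete_log(r, value, p):
    """Smallest k >= 0 with r^k == value (mod p), found by trying each k."""
    acc = 1
    for k in range(p - 1):
        if acc == value:
            return k
        acc = acc * r % p
    raise ValueError("value is not a power of r")


def all_roots(value, n, p):
    """Every y in 1..p-1 with y^n == value (mod p)."""
    return [y for y in range(1, p) if pow(y, n, p) == value]


print(is_primitive_root(2, 101))  # True
print(discrete_log(2, 83, 101))   # 89
print(pow(2, 53, 101))            # 93
print(all_roots(83, 13, 101))     # [93]
print(all_roots(44, 5, 101) == sorted(pow(2, e, 101) for e in (3, 23, 43, 63, 83)))  # True
```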
