In (stable) Rust, is there a relatively straightforward method of implementing the following function?
fn mod_euclid(val: i128, modulo: u128) -> u128;
Note the types! That is, 'standard' euclidean modulus (result is always in the range of [0, mod)), avoiding spurious overflow/underflow in the intermediate calculation. Some test cases:
// don't-care, just no panic or UB.
// Mild preference for treating this as though it was mod=1<<128 instead of 0.
assert_dc!(mod_euclid(i128::MAX, 0));
assert_dc!(mod_euclid( 0, 0));
assert_dc!(mod_euclid(i128::MIN, 0));
assert_eq!(mod_euclid( 1, 10), 1);
assert_eq!(mod_euclid( -1, 10), 9);
assert_eq!(mod_euclid( 11, 10), 1);
assert_eq!(mod_euclid( -11, 10), 9);
assert_eq!(mod_euclid(i128::MAX, 1), 0);
assert_eq!(mod_euclid( 0, 1), 0);
assert_eq!(mod_euclid(i128::MIN, 1), 0);
assert_eq!(mod_euclid(i128::MAX, u128::MAX), i128::MAX as u128);
assert_eq!(mod_euclid( 0, u128::MAX), 0);
assert_eq!(mod_euclid(i128::MIN, u128::MAX), i128::MAX as u128);
For signed%signed->signed, or unsigned%unsigned->unsigned, this is relatively straightforward. However, I can't find a good way of calculating signed % unsigned -> unsigned without converting one of the arguments - and as the last example illustrates, this may overflow or underflow no matter which direction you choose.
As far as I can tell, there is no such function in the standard library, but it's not very difficult to write one yourself:
fn mod_euclid(a: i128, b: u128) -> u128 {
if a >= 0 {
(a as u128) % b
} else {
let r = (!a as u128) % b;
b - r - 1
}
}
How it works:
If a is non-negative then it's straightforward - just use the unsigned remainder operator.
Otherwise, the bitwise complement !a is non-negative (because the sign bit is flipped), and numerically equal to -a - 1. This means r is equivalent to b - a - 1 modulo b, and hence b - r - 1 is equivalent to a modulo b. Conveniently, b - r - 1 is in the expected range 0..b.
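The reasoning above can be sketched as a variant that also avoids the panic that `%` raises when b is 0, which the question's don't-care cases ask for. The name mod_euclid_checked and the choice to return None for b == 0 are assumptions made for illustration, not part of the answer:

```rust
/// Sketch of the complement trick with a guard for `b == 0`.
/// The `Option` return type is an illustrative assumption.
fn mod_euclid_checked(a: i128, b: u128) -> Option<u128> {
    if b == 0 {
        // `%` by zero would panic, so bail out early.
        return None;
    }
    Some(if a >= 0 {
        (a as u128) % b
    } else {
        // For negative a, !a equals -a - 1, which is non-negative
        // and therefore converts to u128 without changing its value.
        let r = (!a as u128) % b;
        // r is in 0..b, so this subtraction cannot underflow.
        b - r - 1
    })
}
```

Callers that prefer the "mod 1 << 128" reading for b == 0 could map None to `a as u128` instead.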
Perhaps a bit more straightforward: use rem_euclid where b fits in an i128, and otherwise return the positive value equivalent to a:
pub fn mod_euclid(a: i128, b: u128) -> u128 {
const UPPER: u128 = i128::MAX as u128;
match b {
1..=UPPER => a.rem_euclid(b as i128) as u128,
_ if a >= 0 => a as u128,
// b > i128::MAX and a < 0: the result is b - |a|, which cannot
// underflow because |a| <= 2^127 <= b. For b == 0 the wrapping
// addition behaves like a modulus of 1 << 128.
_ => b.wrapping_add_signed(a),
}
}
(The parser didn't like my casting in the match hence UPPER)
It also compiles to slightly fewer instructions and jumps on x86_64.
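One way to gain confidence in a match-based version like this is to shrink it to 8 bits and compare every input against a reference computed in a wider type. The sketch below is an illustration with hypothetical names (mod_euclid_8, count_mismatches). Its last arm adds a to b itself, which is what the wide reference requires for every b above the signed maximum:

```rust
/// 8-bit analogue of the match-based approach, small enough to test exhaustively.
fn mod_euclid_8(a: i8, b: u8) -> u8 {
    const UPPER: u8 = i8::MAX as u8;
    match b {
        // b fits in the signed type: the built-in rem_euclid is enough.
        1..=UPPER => a.rem_euclid(b as i8) as u8,
        // b is larger than any non-negative a (or b == 0): a is already reduced.
        _ if a >= 0 => a as u8,
        // Negative a with large b: the result is b - |a|.
        _ => b.wrapping_add_signed(a),
    }
}

/// Counts inputs where the 8-bit version disagrees with i32 arithmetic.
fn count_mismatches() -> usize {
    let mut bad = 0;
    for a in i8::MIN..=i8::MAX {
        for b in 1..=u8::MAX {
            let expected = (a as i32).rem_euclid(b as i32) as u8;
            if mod_euclid_8(a, b) != expected {
                bad += 1;
            }
        }
    }
    bad
}
```

The same structure carries over to i128/u128, where no wider built-in type is available for a direct reference check.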
How can I split a single integer into 100 parts?
The max value and last value will be the input value.
The code is what I have come up with so far:
fn main() {
let max_value: i32 = 6543;
let part = max_value.checked_div(100).unwrap();
for no in 1..=100 {
let num = if no == 100 { max_value } else { part * no };
println!("{}", num);
}
}
This works well if max_value is 100 or larger:
65
130
195
260
325
390
455
[…]
6240
6305
6370
6435
6543
But for max_value smaller than 100 it doesn't work at all:
let max_value: i32 = 90;
0
0
0
0
[…]
0
0
0
0
90
How can I do this properly?
Here is an approach with the following characteristics:
Pure integer arithmetic.
No overflow as long as 2 * d fits in the integer type, i.e. d <= u32::MAX / 2 for u32 (or half the max of whatever integer type you are using).
O(1) integer divisions.
Steps are as evenly spaced as possible – the result is the same as using n * i / d for i in 1..=d, but with only integer additions in each loop iteration, and without the problems with integer overflow.
fn divide_integer(n: u32, d: u32) -> impl Iterator<Item = u32> {
let step = n / d;
let rem_step = n % d;
(0..d).scan((0, 0), move |(current, rem), _| {
*current += step;
*rem += rem_step;
if *rem >= d {
*rem -= d;
*current += 1;
}
Some(*current)
})
}
This increases the current value current by n / d in each iteration, but also keeps track of the remainder rem. The increment for the remainder rem_step is always less than d, so rem stays below 2 * d before the correction step, and rem cannot overflow as long as d <= u32::MAX / 2. The value of current is less than or equal to n at all times, so current can never overflow.
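As a quick check of the claim that the steps match n * i / d, here is a small sketch that repeats the function and compares it with a reference computed in u64. The helper name reference is hypothetical and only exists for this comparison:

```rust
fn divide_integer(n: u32, d: u32) -> impl Iterator<Item = u32> {
    let step = n / d;
    let rem_step = n % d;
    (0..d).scan((0u32, 0u32), move |(current, rem), _| {
        *current += step;
        *rem += rem_step;
        // Carry one unit into `current` whenever the remainder reaches d.
        if *rem >= d {
            *rem -= d;
            *current += 1;
        }
        Some(*current)
    })
}

/// Reference values floor(n * i / d) for i in 1..=d, widened to u64
/// so the multiplication cannot overflow.
fn reference(n: u32, d: u32) -> Vec<u32> {
    (1..=d as u64)
        .map(|i| (n as u64 * i / d as u64) as u32)
        .collect()
}
```

For the question's failing case, `divide_integer(90, 100)` yields 100 values that end at 90 instead of a run of zeros.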
If two numbers are equally close to zero (-2 0 2), I want to return the positive one.
let y = x.iter().min_by_key(|&num| ( num - given_num).abs()).unwrap_or(&given_num);
This prints only the closest value to 0, but doesn't fix my problem.
Rust can compare tuples, so you can return a tuple in the min_by_key closure to prioritize results that would otherwise be equal:
let y = x.iter()
.min_by_key(|&num| ((num - given_num).abs(), -num.signum()))
.unwrap_or(&given_num);
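A self-contained sketch of the tuple-key idea, measured against zero rather than given_num. The function name closest_to_zero is hypothetical, and unsigned_abs is used here (an assumption on my part, not the answer's choice) so that i32::MIN does not overflow:

```rust
/// Closest element to zero; on a tie in absolute value the positive one wins.
fn closest_to_zero(xs: &[i32]) -> Option<i32> {
    xs.iter()
        .copied()
        // Tuples compare lexicographically: the distance is compared first,
        // then -signum (-1 for positive, 0 for zero, 1 for negative)
        // breaks ties in favour of the positive value.
        .min_by_key(|&n| (n.unsigned_abs(), -n.signum()))
}
```

With this key, `closest_to_zero(&[-2, 2])` and `closest_to_zero(&[2, -2])` both give `Some(2)`, regardless of element order.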
Try something like this: First filter out the zero, then use fold to compare minimum absolute values. If both are the same, use the max between the real values.
use std::cmp;
fn main() {
let x: Vec<i32> = vec![-2, 0, 2];
let y = x
.iter()
.filter(|&i| *i != 0)
.fold(i32::MAX, |n, &i| {
if n.abs() < i.abs() {
n
} else if n.abs() == i.abs() {
cmp::max(n, i)
} else {
i
}
});
println!("{}", y);
}
In [3, 2, 1, 1, 1, 0], if the value we are searching for is 1, then the function should return 2.
I found binary search, but it seems to return the last occurrence.
I do not want a function that iterates over the entire vector and matches one by one.
binary_search assumes that the elements are sorted in less-to-greater order. Yours is reversed, so you can use binary_search_by:
let x = 1; //value to look for
let data = [3,2,1,1,1,0];
let idx = data.binary_search_by(|probe| probe.cmp(&x).reverse());
Now, as you say, you do not get the first one. That is expected, for the binary search algorithm will select an arbitrary value equal to the one searched. From the docs:
If there are multiple matches, then any one of the matches could be returned.
That is easily solvable with a loop:
let mut idx = data.binary_search_by(|probe| probe.cmp(&x).reverse());
if let Ok(ref mut i) = idx {
while *i > 0 {
if data[*i - 1] != x {
break;
}
*i -= 1;
}
}
But if you expect many duplicates that may negate the advantages of the binary search.
If that is a problem for you, you can try to be smarter. For example, you can take advantage of this comment in the docs of binary_search:
If the value is not found then Result::Err is returned, containing the index where a matching element could be inserted while maintaining sorted order.
So to get the index of the first 1, you search for an imaginary value just between 2 and 1 (remember that your array is reversed), something like 1.5. That can be done by tweaking the comparison function a bit:
let mut idx = data.binary_search_by(|probe| {
//the 1s in the slice are greater than the 1 in x
probe.cmp(&x).reverse().then(std::cmp::Ordering::Greater)
});
There is a handy function Ordering::then() that does exactly what we need (the Rust stdlib is amazingly complete).
Or you can use a simpler direct comparison:
let mut idx = data.binary_search_by(|probe| {
use std::cmp::Ordering::*;
if *probe > x { Less } else { Greater }
});
The only detail left is that this function will always return Err(i), where i is either the position of the first 1 or the position where the 1 would be inserted if there is none. An extra comparison is necessary to resolve this ambiguity:
if let Err(i) = idx {
//beware! i may be 1 past the end of the slice
if data.get(i) == Some(&x) {
idx = Ok(i);
}
}
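Putting the pieces together, a minimal sketch of the whole lookup might look like this. The function name first_index_desc and the Option return type are assumptions made for illustration:

```rust
use std::cmp::Ordering;

/// Index of the first occurrence of `x` in a slice sorted in descending order.
fn first_index_desc(data: &[i32], x: i32) -> Option<usize> {
    // The comparator never returns Equal: elements greater than x sort
    // to the left, everything else to the right, so the search always
    // ends with Err(i) at the boundary between the two groups.
    let idx = data.binary_search_by(|probe| {
        if *probe > x {
            Ordering::Less
        } else {
            Ordering::Greater
        }
    });
    match idx {
        // i may be one past the end, so use `get` instead of indexing.
        Err(i) if data.get(i) == Some(&x) => Some(i),
        _ => None,
    }
}
```

For the slice from the question, `first_index_desc(&[3, 2, 1, 1, 1, 0], 1)` returns `Some(2)`, and a value that is absent returns `None`.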
Since 1.52.0, [T] has the method partition_point to find the partition point with a predicate in O(log N) time.
In your case, it should be:
let xs = vec![3, 2, 1, 1, 1, 0];
let idx = xs.partition_point(|&a| a > 1);
if idx < xs.len() && xs[idx] == 1 {
println!("Found first 1 idx: {}", idx);
}
I'm pretty new to algorithms and runtimes, and I'm trying to optimise a bit of my code for a personal project.
import math
for num in range(0, 10000000000000000000000):
if all((num**(num+1)+(num+1)**(num))%i!=0 for i in range(2,int(math.sqrt((num**(num+1)+(num+1)**(num))))+1)):
print(num)
What can I do to speed this up? I know that num=80 should work but my code isn't getting past num=0, 1, 2 (it's not fast enough).
First I define my range, then I say if 'such-and-such' is prime from range 2 to sqrt(such-and-such) + 1, then return that number. Sqrt(n) + 1 is the minimum number of factors to test for the primality of n.
This is a primality test of sequence A051442
You would probably get a minor boost from computing (num**(num+1)+(num+1)**(num)) only once per iteration instead of sqrt(num**(num+1)+(num+1)**(num)) times. As you can see, this will greatly reduce the constant factor in your complexity. It won't change the fundamental complexity because you still need to compute the remainder. Change
if all((num**(num+1)+(num+1)**(num))%i!=0 for i in range(2,int(math.sqrt((num**(num+1)+(num+1)**(num))))+1)):
to
k = num**(num+1)+(num+1)**(num)
if all(k%i for i in range(2,int(math.sqrt(k))+1)):
The != 0 is implicit in Python: a nonzero integer is truthy, so k % i can be used directly as the condition.
Update
All this is just a trivial improvement to an extremely inefficient algorithm. The biggest speedup I can think of is to reduce the check k % i to only prime i. For any composite i = a * b such that k % i == 0, it must be the case that k % a == 0 and k % b == 0 (if k is divisible by i, it must also be divisible by the factors of i).
I am assuming that you don't want to use any kind of pre-computed prime tables in your code. In that case, you can compute the table yourself. This will involve checking all the numbers up to a given sqrt(k) only once ever, instead of once per iteration of num, since we can stash the previously computed primes in say a list. That will effectively increase the lower limit of the range in your current all from 2 to the square root of the previous k.
Let's define a function to extend our set of primes, in the spirit of the sieve of Eratosthenes (it actually does trial division against the primes found so far):
from math import sqrt
def extend(primes, from_, to):
"""
primes: a sequence containing prime numbers from 2 to `from_ - 1`, in order
from_: the number to start checking with
to: the number to end with (inclusive)
"""
if not primes:
primes.extend([2, 3])
return
for k in range(max(from_, 5), to + 1):
s = int(sqrt(k)) # No need to compute this more than once per k
for p in primes:
if p > s:
# Reached sqrt(k) -> short circuit success
primes.append(k)
break
elif not k % p:
# Found factor -> short circuit failure
break
Now we can use this function to extend our list of primes at every iteration of the original loop. This allows us to check the divisibility of k only against the slowly growing list of primes, not against all numbers:
primes = []
prev = 0
for num in range(10000000000000000000000):
k = num**(num + 1) + (num + 1)**num
lim = int(sqrt(k)) + 1
extend(primes, prev, lim)
#print('Num={}, K={}, checking {}-{}, p={}'.format(num, k, prev, lim, primes), end='... ')
if k <= 3 and k in primes or all(k % i for i in primes):
print('{}: {} Prime!'.format(num, k))
else:
print('{}: {} Nope'.format(num, k))
prev = lim + 1
I am not 100% sure that my extend function is optimal, but I am able to get to num == 13, k == 4731091158953433 in <10 minutes on my ridiculously old and slow laptop, so I guess it's not too bad. That means that the algorithm builds a complete table of primes up to ~7e7 in that time.
Update #2
A sort-of-but-not-really optimization you could do would be to check all(k % i for i in primes) before calling extend. This would save you a lot of cycles for numbers that have small prime factors, but would probably catch up to you later on, when you would end up having to compute all the primes up to some enormous number. Here is a sample of how you could do that:
primes = []
prev = 0
for num in range(10000000000000000000000):
k = num**(num + 1) + (num + 1)**num
lim = int(sqrt(k)) + 1
if not all(k % i for i in primes):
print('{}: {} Nope'.format(num, k))
continue
start = len(primes)
extend(primes, prev, lim)
if all(k % i for i in primes[start:]):
print('{}: {} Prime!'.format(num, k))
else:
print('{}: {} Nope'.format(num, k))
prev = lim + 1
While this version does not do much for the long run, it does explain why you were able to get to 15 so quickly in your original run. The prime table does not get extended after num == 3, until num == 16, which is when the terrible delay occurs in this version as well. The net runtime to 16 should be identical in both versions.
Update #3
As @paxdiablo suggests, the only numbers we need to consider in extend are multiples of 6 +/- 1. We can combine that with the fact that only a small number of primes generally need to be tested, and convert the functionality of extend into a generator that will only compute as many primes as absolutely necessary. Using Python's lazy generation should help. Here is a completely rewritten version:
from itertools import count
from math import ceil, sqrt
prime_table = [2, 3]
def prime_candidates(start=0):
"""
Infinite generator of prime number candidates starting with the
specified number.
Candidates are 2, 3 and all numbers that are of the form 6n-1 and 6n+1
"""
if start <= 3:
if start <= 2:
yield 2
yield 3
start = 5
delta = 2
else:
m = start % 6
if m < 2:
start += 1 - m
delta = 4
else:
start += 5 - m
delta = 2
while True:
yield start
start += delta
delta = 6 - delta
def isprime(n):
"""
Checks if `n` is prime.
All primes up to sqrt(n) are expected to already be present in
the generated `prime_table`.
"""
s = int(ceil(sqrt(n)))
for p in prime_table:
if p > s:
break
if not n % p:
return False
return True
def generate_primes(max):
"""
Generates primes up to the specified maximum.
First the existing table is yielded. Then, the new primes are
found in the sequence generated by `prime_candidates`. All verified
primes are added to the existing cache.
"""
for p in prime_table:
if p > max:
return
yield p
for k in prime_candidates(prime_table[-1] + 1):
if isprime(k):
prime_table.append(k)
if k > max:
# Putting the return here ensures that we always stop on a prime and therefore don't do any extra work
return
else:
yield k
for num in count():
k = num**(num + 1) + (num + 1)**num
lim = int(ceil(sqrt(k)))
b = all(k % i for i in generate_primes(lim))
print('n={}, k={} is {}prime'.format(num, k, '' if b else 'not '))
This version gets to 15 almost instantly. It gets stuck at 16 because the smallest prime factor for k=343809097055019694337 is 573645313. Some future expectations:
17 should be a breeze: 16248996011806421522977 has factor 19
18 will take a while: 812362695653248917890473 has factor 22156214713
19 is easy: 42832853457545958193355601 is divisible by 3
20 also easy: 2375370429446951548637196401 is divisible by 58967
21: 138213776357206521921578463913 is divisible by 13
22: 8419259736788826438132968480177 is divisible by 103
etc... (link to sequence)
So in terms of instant gratification, this method will get you much further if you can make it past 18 (which will take >100 times longer than getting past 16, which in my case took ~1.25hrs).
That being said, your greatest speedup at this point would be re-writing this in C or some similar low-level language that does not have as much overhead for loops.
Update #4
Just for giggles, here is an implementation of the latest Python version in C. I chose to go with GMP for arbitrary precision integers, because it is easy to use and install on my Red Hat system, and the docs are very clear:
#include <stdio.h>
#include <stdlib.h>
#include <gmp.h>
typedef struct {
size_t alloc;
size_t size;
mpz_t *numbers;
} PrimeTable;
void init_table(PrimeTable *buf)
{
buf->alloc = 0x100000L;
buf->size = 2;
buf->numbers = malloc(buf->alloc * sizeof(mpz_t));
if(buf->numbers == NULL) {
fprintf(stderr, "No memory for prime table\n");
exit(1);
}
mpz_init_set_si(buf->numbers[0], 2);
mpz_init_set_si(buf->numbers[1], 3);
return;
}
void append_table(PrimeTable *buf, mpz_t number)
{
if(buf->size == buf->alloc) {
size_t new = 2 * buf->alloc;
mpz_t *tmp = realloc(buf->numbers, new * sizeof(mpz_t));
if(tmp == NULL) {
fprintf(stderr, "Ran out of memory for prime table\n");
exit(1);
}
buf->alloc = new;
buf->numbers = tmp;
}
// The slot is raw malloc'd memory, so initialize it while copying
mpz_init_set(buf->numbers[buf->size], number);
buf->size++;
return;
}
size_t print_table(PrimeTable *buf, FILE *file)
{
size_t i, n;
n = fprintf(file, "Array contents = [");
for(i = 0; i < buf->size; i++) {
n += mpz_out_str(file, 10, buf->numbers[i]);
if(i < buf->size - 1)
n += fprintf(file, ", ");
}
n += fprintf(file, "]\n");
return n;
}
void free_table(PrimeTable *buf)
{
for(buf->size--; ((signed)(buf->size)) >= 0; buf->size--)
mpz_clear(buf->numbers[buf->size]);
free(buf->numbers);
return;
}
int isprime(mpz_t num, PrimeTable *table)
{
mpz_t max, rem, next;
size_t i, d, r;
mpz_inits(max, rem, NULL);
mpz_sqrtrem(max, rem, num);
// Check if perfect square: definitely not prime
if(!mpz_cmp_si(rem, 0)) {
mpz_clears(rem, max, NULL);
return 0;
}
/* Normal table lookup */
for(i = 0; i < table->size; i++) {
// Got to sqrt(n) -> prime
if(mpz_cmp(max, table->numbers[i]) < 0) {
mpz_clears(rem, max, NULL);
return 1;
}
// Found a factor -> not prime
if(mpz_divisible_p(num, table->numbers[i])) {
mpz_clears(rem, max, NULL);
return 0;
}
}
/* Extend table and do lookup */
// Start with last found prime + 2
mpz_init_set(next, table->numbers[i - 1]);
mpz_add_ui(next, next, 2);
// Find nearest number of form 6n-1 or 6n+1
r = mpz_fdiv_ui(next, 6);
if(r < 2) {
mpz_add_ui(next, next, 1 - r);
d = 4;
} else {
mpz_add_ui(next, next, 5 - r);
d = 2;
}
// Step along numbers of form 6n-1/6n+1. Check each candidate for
// primality. Don't stop until next prime after sqrt(n) to avoid
// duplication.
for(;;) {
if(isprime(next, table)) {
append_table(table, next);
if(mpz_divisible_p(num, next)) {
mpz_clears(next, rem, max, NULL);
return 0;
}
if(mpz_cmp(max, next) <= 0) {
mpz_clears(next, rem, max, NULL);
return 1;
}
}
mpz_add_ui(next, next, d);
d = 6 - d;
}
// Return can only happen from within loop.
}
int main(int argc, char *argv[])
{
PrimeTable table;
mpz_t k, a, b;
size_t n, next;
int p;
init_table(&table);
mpz_inits(k, a, b, NULL);
for(n = 0; ; n = next) {
next = n + 1;
mpz_set_ui(a, n);
mpz_pow_ui(a, a, next);
mpz_set_ui(b, next);
mpz_pow_ui(b, b, n);
mpz_add(k, a, b);
p = isprime(k, &table);
printf("n=%zu k=", n);
mpz_out_str(stdout, 10, k);
printf(" p=%d\n", p);
//print_table(&table, stdout);
}
mpz_clears(b, a, k, NULL);
free_table(&table);
return 0;
}
While this version has the exact same algorithmic complexity as the Python one, I expect it to run a few orders of magnitude faster because of the relatively minimal overhead incurred in C. And indeed, it took about 15 minutes to get stuck at n == 18, which is ~5 times faster than the Python version so far.
Update #5
This is going to be the last one, I promise.
GMP has a function called mpz_nextprime, which offers a potentially much faster implementation of this algorithm, especially with caching. According to the docs:
This function uses a probabilistic algorithm to identify primes. For practical purposes it’s adequate, the chance of a composite passing will be extremely small.
This means that it is probably much faster than the current prime generator I implemented, with a slight cost offset of some false primes being added to the cache. This cost should be minimal: even adding a few thousand extra modulo operations should be fine if the prime generator is faster than it is now.
The only part that needs to be replaced/modified is the portion of isprime below the comment /* Extend table and do lookup */. Basically that whole section just becomes a series of calls to mpz_nextprime instead of recursion.
At that point, you may as well adapt isprime to use mpz_probab_prime_p when possible. You only need to check for sure if the result of mpz_probab_prime_p is uncertain:
int isprime(mpz_t num, PrimeTable *table)
{
mpz_t max, rem, next;
size_t i, r;
int status;
status = mpz_probab_prime_p(num, 50);
// Status = 2 -> definite yes, Status = 0 -> definite no
if(status != 1)
return status != 0;
mpz_inits(max, rem, NULL);
mpz_sqrtrem(max, rem, num);
// Check if perfect square: definitely not prime
if(!mpz_cmp_si(rem, 0)) {
mpz_clears(rem, max, NULL);
return 0;
}
mpz_clear(rem);
/* Normal table lookup */
for(i = 0; i < table->size; i++) {
// Got to sqrt(n) -> prime
if(mpz_cmp(max, table->numbers[i]) < 0) {
mpz_clear(max);
return 1;
}
// Found a factor -> not prime
if(mpz_divisible_p(num, table->numbers[i])) {
mpz_clear(max);
return 0;
}
}
/* Extend table and do lookup */
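// Start from the last found prime: mpz_nextprime returns the next
// prime strictly greater than its argument, so adding 2 first would
// skip the upper member of a twin-prime pair
mpz_init_set(next, table->numbers[i - 1]);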
// Step along probable primes
for(;;) {
mpz_nextprime(next, next);
append_table(table, next);
if(mpz_divisible_p(num, next)) {
r = 0;
break;
}
if(mpz_cmp(max, next) <= 0) {
r = 1;
break;
}
}
mpz_clears(next, max, NULL);
return r;
}
Sure enough, this version makes it to n == 79 in a couple of seconds at most. It appears to get stuck on n == 80, probably because mpz_probab_prime_p can't determine if k is a prime for sure. I doubt that computing all the primes up to ~10^80 is going to take a trivial amount of time.