Getting the length of an int - rust

I am trying to get the length (the number of digits when interpreted in decimal) of an int in Rust. I found a way to do it, but I am looking for a method that comes from the primitive itself. This is what I have:
let num = 90.to_string();
println!("num: {}", num.chars().count())
// num: 2
I am looking at https://docs.rs/digits/0.3.3/digits/struct.Digits.html#method.length. Is this a good candidate? How do I use it? Or are there other crates that do this for me?
A one-liner with fewer type conversions is the ideal solution I am looking for.

You could loop and check how often you can divide the number by 10 before it becomes a single digit.
Or in the other direction (because division is slower than multiplication), check how often you can multiply 10*10*...*10 until you reach the number:
fn length(n: u32, base: u32) -> u32 {
    let mut power = base;
    let mut count = 1;
    while n >= power {
        count += 1;
        if let Some(new_power) = power.checked_mul(base) {
            power = new_power;
        } else {
            break;
        }
    }
    count
}
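The division-based variant mentioned first is not spelled out above. A minimal sketch of it could look like the following; the name `length_by_division`, the `base >= 2` check, and the choice to count zero as one digit are assumptions made for this example.

```rust
/// Counts the digits of `n` in `base` by repeated division.
/// Assumes `base >= 2`; zero is reported as one digit.
fn length_by_division(mut n: u32, base: u32) -> u32 {
    assert!(base >= 2, "base must be at least 2");
    let mut count = 1;
    // Each division strips one digit; stop once a single digit is left.
    while n >= base {
        n /= base;
        count += 1;
    }
    count
}
```

Unlike the multiplication loop, this version needs no overflow check, at the cost of one division per digit.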
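When this answer was written, this needed nightly Rust and the int_log feature. The integer logarithm methods have since been stabilized (Rust 1.67) under the names ilog10 and checked_ilog10, so on a current toolchain you can use:
n.checked_ilog10().unwrap_or(0) + 1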

Here is a one-liner that doesn't require strings or floating point:
println!("num: {}", successors(Some(n), |&n| (n >= 10).then(|| n / 10)).count());
It simply counts the number of times the initial number needs to be divided by 10 in order to reach 0.
EDIT: the first version of this answer used iterate from the (excellent and highly recommended) itertools crate, but @trentcl pointed out that successors from the stdlib does the same. For reference, here is the version using iterate:
println!("num: {}", iterate(n, |&n| n / 10).take_while(|&n| n > 0).count().max(1));
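For completeness, here is the successors one-liner wrapped in a function together with the import it needs. The function name `digit_count` and the `u64` parameter type are just choices for this sketch.

```rust
use std::iter::successors;

/// Decimal digit count via `successors`; zero counts as one digit.
fn digit_count(n: u64) -> usize {
    // Yields n, n / 10, n / 100, ... and stops once the value is below 10.
    successors(Some(n), |&n| (n >= 10).then(|| n / 10)).count()
}
```

Because the first element is always yielded, `digit_count(0)` is 1 without any special casing.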

Here's a (barely) one-liner that's faster than doing a string conversion, using std::iter stuff:
let some_int = 9834;
let decimal_places = (0..).take_while(|i| 10u64.pow(*i) <= some_int).count();

The first method below relies on the following formula, where a and b are the logarithmic bases.
log<a>( x ) = log<b>( x ) / log<b>( a )
log<a>( x ) = log<2>( x ) / log<2>( a ) // Substituting 2 for `b`.
The following function can be applied to finding the number of digits for bases that are a power of 2. This approach is very fast.
fn num_digits_base_pow2(n: u64, b: u32) -> u32
{
    (63 - n.leading_zeros()) / (31 - b.leading_zeros()) + 1
}
The bits are counted for both n (the number we want to represent) and b (the base) to find their log2 floor values. Then the adjusted ratio of these values gives the ceiling log value in the desired base.
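Here is a worked version of the power-of-two formula with the two intermediate log2 values named. The guard for n == 0 (counted as one digit here) and the base check are additions for this sketch, not part of the function above.

```rust
/// Digits of `n` in base `b`, where `b` is assumed to be a power of two (2, 4, 8, 16, ...).
fn num_digits_base_pow2(n: u64, b: u32) -> u32 {
    assert!(b >= 2 && b.is_power_of_two(), "base must be a power of two");
    if n == 0 {
        // `leading_zeros` is 64 for zero, so `63 - 64` would underflow.
        return 1;
    }
    let log2_n = 63 - n.leading_zeros(); // floor(log2(n))
    let log2_b = 31 - b.leading_zeros(); // exact, since b is a power of two
    log2_n / log2_b + 1
}
```

For example, 255 in base 16 gives floor(log2(255)) = 7 and log2(16) = 4, so 7 / 4 + 1 = 2 digits (ff).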
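For a general-purpose approach to finding the number of digits for arbitrary bases, the following should suffice for values small enough to be represented exactly as an f64.
fn num_digits(n: u64, b: u32) -> u32
{
    // Digits = ceil(log_b(n + 1)); the `+ 1` makes exact powers of `b` (10, 100, ...) count correctly.
    // Floating-point rounding can still be off by one near exact powers of `b`.
    (n as f64 + 1.0).log(b as f64).ceil() as u32
}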

If num is signed:
let digits = (num.abs() as f64 + 0.1).log10().ceil() as u32;

A nice property of numbers that is always good to have in mind is that the number of digits required to write a number $x$ in base $n$ is actually $\lceil \log_n(x + 1) \rceil$.
Therefore, one can simply write the following function (notice the cast from u32 to f32, since integers don't have a log function).
fn length(n: u32, base: u32) -> u32 {
    let n = (n+1) as f32;
    n.log(base as f32).ceil() as u32
}
You can easily adapt it for negative numbers. For floating point numbers this might be a bit (i.e. a lot) more tricky.
To take into account Daniel's comment about the pathological cases introduced by using f32, note that integers have a logarithm method of their own. It was nightly-only (behind the int_log feature) when this answer was written and has been stable since Rust 1.67 under the name ilog. (Notice that, imo, those are implementation details, and you should focus more on understanding the algorithm than the implementation.)
fn length(n: u32, base: u32) -> u32 {
    // `ilog` panics if `n` is 0 or `base` is less than 2.
    n.ilog(base) + 1
}

Related

workaround for not being able to use natural log in constant assignment

I am trying to write a program to find the nth prime. To do this I use the limit given by the prime number theorem:
p(n) < n(ln(n) + ln(ln(n)))
to calculate an upper bound for the nth prime number, use const generics to create an array of that length and then perform a sieve of Eratosthenes to find all primes in that array. However, I have run into the issue that my current code will not compile due to the line:
const N_PRIME: i32 = 10_001;
// ceil use instead of floor to remove possibility of floating point error causing issue with bound
// 1 added to account for array indices starting from 0
const UP_TO: i32 = {let np = N_PRIME as f64; (np * (np.ln() + np.ln().ln())).ceil() as i32 + 1};
with the error message
const UP_TO: i32 = {let np = N_PRIME as f64; (np * (np.ln() + np.ln().ln())).ceil() as i32 + 1};
| ^^^^
|
= note: calls in constants are limited to constant functions, tuple structs and tuple variants
Naturally, I could simply calculate UP_TO and hardcode it, or I could rewrite the code to use vectors. However, the former approach is clunkier and the latter would be slightly slower, so a workaround would be preferable. Can anyone think of a way to get the needed value into the UP_TO constant?
Thanks
Edit: Code suggested by pigionhands (as I understand it), this code results in the same error:
const N_PRIME: i32 = 10_001;
fn up_to<const N_PRIME: i32>() -> usize {
    let np = N_PRIME as f64;
    (np * (np.ln() + np.ln().ln())).ceil() as usize + 1
}
fn main() {
    println!("{}", UP_TO);
    let test_arr = [0; up_to::<N_PRIME>()];
}
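No workaround is given in the thread. One possible sketch, which is not from the original post, replaces the floating-point logarithm with an integer over-estimate that a const fn can compute. It assumes Rust 1.67 or later (where `ilog2` can be called in constants), switches N_PRIME to usize, and relies on the bound p(n) < n(ln(n) + ln(ln(n))) holding for n >= 6. The helper names `ln_upper_milli`, `up_to` and `nth_prime` are made up for this example. Since any upper bound works for the sieve, over-estimating is harmless apart from a somewhat larger array.

```rust
const N_PRIME: usize = 10_001;

/// Integer upper bound on 1000 * ln(x).
/// Uses ln(x) < (floor(log2(x)) + 1) * ln(2) together with ln(2) < 0.694.
const fn ln_upper_milli(x: u64) -> u64 {
    (x.ilog2() as u64 + 1) * 694
}

/// Over-estimate of n * (ln(n) + ln(ln(n))) that can be evaluated in a constant.
const fn up_to(n: usize) -> usize {
    let ln_n = ln_upper_milli(n as u64); // >= 1000 * ln(n)
    let ln_ln_n = ln_upper_milli(ln_n / 1000 + 1); // >= 1000 * ln(ln(n))
    n * (ln_n + ln_ln_n) as usize / 1000 + 1
}

const UP_TO: usize = up_to(N_PRIME);

/// Sieve of Eratosthenes over a fixed-size array of length `UP_TO`.
fn nth_prime() -> usize {
    let mut composite = [false; UP_TO];
    let mut count = 0;
    for i in 2..UP_TO {
        if !composite[i] {
            count += 1;
            if count == N_PRIME {
                return i;
            }
            // Cross out multiples of i, starting at i * i.
            let mut j = i.saturating_mul(i);
            while j < UP_TO {
                composite[j] = true;
                j += i;
            }
        }
    }
    unreachable!("for N_PRIME >= 6 the N_PRIME-th prime lies below UP_TO")
}
```

The bound is looser than the floating-point one, so the array is a little larger than strictly needed.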

Datatype for indexing vector

I just started to learn Rust. I understand that Rust's for loop indices and vector indices must be of type usize, hence I have written the following code. The computation of j within the for loop requires i to be type u32, so I convert it. Now, I have to change the type of i and j again to get the vector items.
I would like to avoid this constant back-and-forth conversion. Is there an alternative way to do this in Rust? Thank you for your help.
fn compute(dots: Vec, N: u32) -> f32 {
    let mut j: u32;
    let mut value: f32 = 0.0;
    for i in 0..N {
        j = (i as u32 + 1) % N;
        value += dots[i as usize].a * dots[j as usize].b;
        value -= dots[i as usize].b * dots[j as usize].a;
    }
    return value
}
Either change the function signature to use N: usize, or, if you can't do that, just let M = N as usize and loop over 0..M (the loop variable will then have type usize).
Be aware that in real code, you need to be sure that usize is at least as wide as u32 if you opt for the conversion. If you cannot assure that, use try_into instead of as to convert.
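As a rough sketch of the second suggestion, the conversion can be done once at the top of the function. The `Dot` struct is hypothetical (the question does not show the element type), and the slice parameter and lower-case `n` are choices made for this example.

```rust
use std::convert::TryFrom;

// Hypothetical element type; the original snippet does not show it.
struct Dot {
    a: f32,
    b: f32,
}

fn compute(dots: &[Dot], n: u32) -> f32 {
    // Convert once; `try_from` fails instead of silently truncating
    // on targets where `usize` is narrower than `u32`.
    let n = usize::try_from(n).expect("n does not fit in usize");
    let mut value = 0.0;
    for i in 0..n {
        // `i` and `j` are both `usize`, so no casts are needed for indexing.
        let j = (i + 1) % n;
        value += dots[i].a * dots[j].b;
        value -= dots[i].b * dots[j].a;
    }
    value
}
```

With the four corners of a unit square as input, this returns 2.0.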

Is there a way to `f64::from(0.23_f32)` and get 0.23_f64?

I'm trying to tie together two pieces of software: one that gives me a f32, and one that expects f64 values. In my code, I use f64::from(my_f32), but in my test, I compare the outcome and the value that I'm comparing has not been converted as expected: the f64 value has a bunch of extra, more precise, digits, such that the values aren't equal.
In my case, the value is 0.23. Is there a way to convert the 0.23_f32 to f64 such that I end up with 0.23_f64 instead of 0.23000000417232513?
fn main() {
    let x = 0.23_f32;
    println!("{}", x);
    println!("{}", f64::from(x));
    println!("---");
    let x = 0.23_f64;
    println!("{}", x);
    println!("{}", f64::from(x));
}
Edit: I understand that floating-point numbers are stored differently--in fact, I use this handy visualizer on occasion to view the differences in representations between 32-bit and 64-bit floats. I was looking to see if there's some clever way to get around this.
Edit 2: A "clever" example that I just conjured up would be my_32.to_string().parse::<f64>()--that gets me 0.23_f64, but (obviously) requires string parsing. I'd like to think there might be something at least slightly more numbers-related (for lack of a better term).
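The string round trip from Edit 2 can be written out as a small helper; the name `widen_via_string` is made up for this sketch. It works because formatting an f32 prints the shortest decimal text that parses back to the same f32.

```rust
/// Widens an `f32` by way of its shortest decimal representation.
fn widen_via_string(x: f32) -> f64 {
    // `to_string` prints the shortest decimal that round-trips to the same `f32`
    // ("0.23" for 0.23_f32), and parsing that text yields the nearest `f64`.
    x.to_string()
        .parse::<f64>()
        .expect("a formatted float always parses back")
}
```

This allocates a String per call, which is the cost the question is hoping to avoid.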
Comments have already pointed out why this is happening. This answer exists to give you ways to circumvent this.
The first (and most obvious) is to use arbitrary-precision libraries. A solid example of this in Rust is rug. This allows you to express pretty much any number exactly, but it causes some problems across FFI boundaries (amongst other cases).
The second is to do what most people do around floating point numbers, and bracket your equalities. Since you know that most floats will not be stored exactly, and you know your input type, you can use constants such as std::f32::EPSILON to bracket your type, like so:
use std::cmp::PartialOrd;
use std::ops::{Add, Div, Sub};
fn bracketed_eq<
    I,
    E: From<I> + From<f32> + Clone + PartialOrd + Div<Output = E> + Sub<Output = E> + Add<Output = E>,
>(
    input: E,
    target: I,
    value: I,
) -> bool {
    let target: E = target.into();
    let value: E = value.into();
    // The bracket extends half of `value` on either side of `target`.
    let bracket_lhs: E = target.clone() - (value.clone() / (2.0).into());
    let bracket_rhs: E = target.clone() + (value.clone() / (2.0).into());
    // `input` must lie inside [bracket_lhs, bracket_rhs].
    bracket_lhs <= input && input <= bracket_rhs
}
#[test]
fn test() {
    let u: f32 = 0.23_f32;
    assert!(bracketed_eq(f64::from(u), 0.23, std::f32::EPSILON))
}
A large amount of this is boilerplate and a lot of it gets completely optimized away by the compiler; it is also possible to drop the Clone requirement by restricting some trait choices. Add, Sub, Div are there for the operations, From<I> to realize the conversion, From<f32> for the constant 2.0.
The right way to compare floating-point values is to bracket them. The question is how to determine the bracketing interval? In your case, since you have a representation of the target value as f32, you have two solutions:
The obvious solution is to do the comparison between f32s, so convert your f64 result to f32 to get rid of the extra digits, and compare that to the expected result. Of course, this may still fail if accumulated rounding errors cause the result to be slightly different.
The right solution would have been to use the next_after function to get the smallest bracketing interval around your target:
let result: f64 = 0.23f64;
let expect: f32 = 0.23;
assert_ne!(result, expect.into());
assert!(expect.next_after (0.0).into() < result && result < expect.next_after (1.0).into());
but unfortunately this was never stabilized (see #27752).
So you will have to determine the precision that is acceptable to you, possibly as a function of f32::EPSILON:
let result: f64 = 0.23f64;
let expect: f32 = 0.23;
assert_ne!(result, expect.into());
assert!(f64::from (expect) - f64::from (std::f32::EPSILON) < result && result < f64::from (expect) + f64::from (std::f32::EPSILON));
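The first option mentioned above (comparing on the f32 side) has no code in the answer. A minimal sketch could look like this; the helper name `matches_at_f32_precision` is an assumption for the example.

```rust
/// Compares an `f64` result against an `f32` expectation at `f32` precision.
fn matches_at_f32_precision(result: f64, expect: f32) -> bool {
    // `as f32` rounds the `f64` to the nearest `f32`, discarding the extra digits.
    result as f32 == expect
}
```

As noted, this is still an exact comparison after narrowing, so accumulated rounding error in `result` can make it fail.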
If you don't want to compare the value, but instead want to truncate it before passing it on to some computation, then the function to use is f64::round:
const PRECISION: f64 = 100.0;
let from_db: f32 = 0.23;
let truncated = (f64::from (from_db) * PRECISION).round() / PRECISION;
println!("f32 : {:.32}", from_db);
println!("f64 : {:.32}", 0.23f64);
println!("output: {:.32}", truncated);
prints:
f32 : 0.23000000417232513427734375000000
f64 : 0.23000000000000000999200722162641
output: 0.23000000000000000999200722162641
A couple of notes:
The result is still not equal to 0.23 since that number cannot be represented as an f64 (or as an f32 for that matter), but it is as close as you can get.
If there are legal implications as you implied, then you probably shouldn't be using floating point numbers in the first place but you should use either some kind of fixed-point with the legally mandated precision, or some arbitrary precision library.

C-style switch statement with fall-through in Rust [duplicate]

I’m new to Rust, but as a fan of Haskell, I greatly appreciate the way match works in Rust. Now I’m faced with the rare case where I do need fall-through – in the sense that I would like all matching cases of several overlapping ones to be executed. This works:
fn options(stairs: i32) -> i32 {
    if stairs == 0 {
        return 1;
    }
    let mut count: i32 = 0;
    if stairs >= 1 {
        count += options(stairs - 1);
    }
    if stairs >= 2 {
        count += options(stairs - 2);
    }
    if stairs >= 3 {
        count += options(stairs - 3);
    }
    count
}
My question is whether this is idiomatic in Rust or whether there is a better way.
The context is a question from Cracking the Coding Interview: “A child is running up a staircase with n steps and can hop either 1 step, 2 steps, or 3 steps at a time. Implement a method to count how many possible ways the child can run up the stairs.”
Based on the definition of the tribonacci sequence I found you could write it in a more concise manner like this:
fn options(stairs: i32) -> i32 {
    match stairs {
        // Base cases chosen to match the function in the question:
        // options(0) = 1, options(1) = 1, options(2) = 2.
        0 | 1 => 1,
        2 => 2,
        _ => options(stairs - 1) + options(stairs - 2) + options(stairs - 3)
    }
}
I would also recommend changing the function definition to only accept non-negative integers, e.g. u32.
To answer the generic question, I would argue that match and fallthrough are somewhat antithetical.
match is used to perform different actions based on different patterns. Most of the time, the values extracted via pattern matching are so different that a fallthrough does not make sense.
A fallthrough, instead, points to a sequence of actions. There are many ways to express sequences: recursion, iteration, ...
In your case, for example, one could use a loop:
for i in 1..4 {
if stairs >= i {
count += options(stairs - i);
}
}
Of course, I find @ljedrz's solution even more elegant in this particular instance.
I would advise to avoid recursion in Rust. It is better to use iterators:
struct Trib(usize, usize, usize);
impl Default for Trib {
    fn default() -> Trib {
        Trib(1, 0, 0)
    }
}
impl Iterator for Trib {
    type Item = usize;
    fn next(&mut self) -> Option<usize> {
        let &mut Trib(a, b, c) = self;
        let d = a + b + c;
        *self = Trib(b, c, d);
        Some(d)
    }
}
fn options(stairs: usize) -> usize {
    Trib::default().take(stairs + 1).last().unwrap()
}
fn main() {
    for (i, v) in Trib::default().enumerate().take(10) {
        println!("i={}, t={}", i, v);
    }
    println!("{}", options(0));
    println!("{}", options(1));
    println!("{}", options(3));
    println!("{}", options(7));
}
Your code looks pretty idiomatic to me, although @ljedrz has suggested an even more elegant rewriting of the same strategy.
Since this is an interview problem, it's worth mentioning that neither solution is going to be seen as an amazing answer because both solutions take exponential time in the number of stairs.
Here is what I might write if I were trying to crack a coding interview:
fn options(stairs: usize) -> u128 {
    let mut o = vec![1, 1, 2, 4];
    for _ in 3..stairs {
        o.push(o[o.len() - 1] + o[o.len() - 2] + o[o.len() - 3]);
    }
    o[stairs]
}
Instead of recomputing options(n) each time, we cache each value in an array. So, this should run in linear time instead of exponential time. I also switched to a u128 to be able to return solutions for larger inputs.
Keep in mind that this is not the most efficient solution because it uses linear space. You can get away with using constant space by only keeping track of the final three elements of the array. I chose this as a compromise between conciseness, readability, and efficiency.
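The constant-space variant mentioned above could be sketched like this. The function name `options_constant_space` is made up here to keep it apart from the vector version.

```rust
fn options_constant_space(stairs: usize) -> u128 {
    // (a, b, c) hold the counts for three consecutive stair heights,
    // starting with options(0), options(1), options(2).
    let (mut a, mut b, mut c): (u128, u128, u128) = (1, 1, 2);
    for _ in 0..stairs {
        // Slide the window one step up the staircase.
        let next = a + b + c;
        a = b;
        b = c;
        c = next;
    }
    a
}
```

Note that `c` runs two steps ahead of the returned value, so this sketch hits u128 overflow slightly earlier than strictly necessary for very large inputs.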

Is `iter().map().sum()` as fast as `iter().fold()`?

Does the compiler generate the same code for iter().map().sum() and iter().fold()? In the end they achieve the same goal, but the first code would iterate two times, once for the map and once for the sum.
Here is an example. Which version would be faster in total?
pub fn square(s: u32) -> u64 {
    match s {
        s @ 1..=64 => 2u64.pow(s - 1),
        _ => panic!("Square must be between 1 and 64")
    }
}
pub fn total() -> u64 {
    // A fold
    (0..64).fold(0u64, |r, s| r + square(s + 1))
    // or a map
    (1..64).map(square).sum()
}
What would be good tools to look at the assembly or benchmark this?
For them to generate the same code, they'd first have to do the same thing. Your two examples do not:
fn total_fold() -> u64 {
    (0..64).fold(0u64, |r, s| r + square(s + 1))
}
fn total_map() -> u64 {
    (1..64).map(square).sum()
}
fn main() {
    println!("{}", total_fold());
    println!("{}", total_map());
}
18446744073709551615
9223372036854775807
Let's assume you meant
fn total_fold() -> u64 {
    (1..64).fold(0u64, |r, s| r + square(s + 1))
}
fn total_map() -> u64 {
    (1..64).map(|i| square(i + 1)).sum()
}
There are a few avenues to check:
The generated LLVM IR
The generated assembly
Benchmark
The easiest source for the IR and assembly is one of the playgrounds (official or alternate). These both have buttons to view the assembly or IR. You can also pass --emit=llvm-ir or --emit=asm to the compiler to generate these files.
Make sure to generate assembly or IR in release mode. The attribute #[inline(never)] is often useful to keep functions separate to find them easier in the output.
Benchmarking is documented in The Rust Programming Language, so there's no need to repeat all that valuable information.
Before Rust 1.14, these do not produce the exact same assembly. I'd wait for benchmarking / profiling data to see if there's any meaningful impact on performance before I worried.
As of Rust 1.14, they do produce the same assembly! This is one reason I love Rust. You can write clear and idiomatic code and smart people come along and make it equally as fast.
"but the first code would iterate two times, once for the map and once for the sum."
This is incorrect, and I'd love to know what source told you this so we can go correct it at that point and prevent future misunderstandings. An iterator operates on a pull basis; one element is processed at a time. The core method is next, which yields a single value, running just enough computation to produce that value.
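To see the pull-based behaviour directly, here is a small sketch that records the order in which the closures run. The sum is written as a fold only so that it can log each addition; the function name `trace_map_then_sum` and the log strings are made up for this example.

```rust
use std::cell::RefCell;

/// Squares 1..=3, adds the squares up, and records the order of events.
fn trace_map_then_sum() -> (u32, Vec<String>) {
    // `RefCell` lets both closures append to the same log.
    let log = RefCell::new(Vec::new());
    let total = (1..=3u32)
        .map(|i| {
            log.borrow_mut().push(format!("map {}", i));
            i * i
        })
        .fold(0, |acc, sq| {
            log.borrow_mut().push(format!("add {}", sq));
            acc + sq
        });
    (total, log.into_inner())
}
```

The log comes out as map 1, add 1, map 2, add 4, map 3, add 9: each element is mapped and then added before the next one is pulled, so there is a single pass over the range.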
First, let's fix those examples to actually return the same result:
pub fn total_fold_iter() -> u64 {
    (1..65).fold(0u64, |r, s| r + square(s))
}
pub fn total_map_iter() -> u64 {
    (1..65).map(square).sum()
}
Now, let's develop them, starting with fold. A fold is just a loop and an accumulator, it is roughly equivalent to:
pub fn total_fold_explicit() -> u64 {
    let mut total = 0;
    for i in 1..65 {
        total = total + square(i);
    }
    total
}
Then, let's go with map and sum, and unwrap the sum first, which is roughly equivalent to:
pub fn total_map_partial_iter() -> u64 {
    let mut total = 0;
    for i in (1..65).map(square) {
        total += i;
    }
    total
}
It's just a simple accumulator! And now, let's unwrap the map layer (which only applies a function), obtaining something that is roughly equivalent to:
pub fn total_map_explicit() -> u64 {
    let mut total = 0;
    for i in 1..65 {
        let s = square(i);
        total += s;
    }
    total
}
As you can see, the two are extremely similar: they apply the same operations in the same order and have the same overall complexity.
Which is faster? I have no idea. And a micro-benchmark may only tell half the truth anyway: just because something is faster in a micro-benchmark does not mean it is faster in the midst of other code.
What I can say, however, is that they both have equivalent complexity and therefore should behave similarly, i.e. within a constant factor of each other.
And that I would personally go for map + sum, because it expresses the intent more clearly whereas fold is the "kitchen-sink" of Iterator methods and therefore far less informative.

Resources