Arithmetic operation order issue with a range - string

I'm trying to learn Rust using the Rust book and the Exercism.io website.
I have an issue with this specific exercise. The code is as follows:
pub fn series(_digits: &str, _len: usize) -> Vec<String> {
    (0.._digits.len() + 1 - _len)
        .map(|i| _digits[i..i + _len].to_string())
        .collect()
}
For example, series("12345", 3) should return a Vec containing ["123", "234", "345"].
Instead of (0.._digits.len() + 1 - _len), I experimented with (0.._digits.len() - _len + 1), but in that case the unit test "test_too_long" fails:
#[test]
#[ignore]
fn test_too_long() {
    let expected: Vec<String> = vec![];
    assert_eq!(series("92017", 6), expected);
}
I'm surprised because the two expressions look equivalent to me. Why does it fail?

This happens because in debug mode, arithmetic operations that would overflow instead panic, and panicking causes tests to fail.
With the rearranged version (playground), in series("12345", 6), digits.len() - len + 1 becomes 5usize - 6usize + 1usize. The program doesn't even get to the + 1, because just 5usize - 6usize panics. (usize can't represent negative numbers, so subtracting 6 from 5 causes overflow.)
The error message contains a strong hint at the nature of the failure:
---- test_too_long stdout ----
thread 'test_too_long' panicked at 'attempt to subtract with overflow', src/lib.rs:2:9
note: Run with `RUST_BACKTRACE=1` for a backtrace.
digits.len() + 1 - len works, however, because 6 is exactly one more than the length of the string, and so 5 + 1 - 6 can evaluate to zero without overflow. But if you change test_too_long to call series("12345", 7) instead, both versions panic. This seems like an oversight on the part of whoever wrote the test suite, especially considering that the instructions don't specify the expected behavior:
And if you ask for a 6-digit series from a 5-digit string, you deserve whatever you get.
For what it's worth, here's one way to make series return an empty vector for any len greater than the length of the input: (digits.len() + 1).saturating_sub(len) is like digits.len() + 1 - len, but if the result of the subtraction would be less than 0, it just returns 0.
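Plugged into the original function, that looks like this (a sketch of the same series, just with the saturating subtraction):
pub fn series(digits: &str, len: usize) -> Vec<String> {
    // saturating_sub clamps the result at 0 instead of panicking,
    // so a too-large `len` yields an empty range and therefore an empty Vec.
    (0..(digits.len() + 1).saturating_sub(len))
        .map(|i| digits[i..i + len].to_string())
        .collect()
}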

Related

Why does Rust perform integer overflow checks in --release?

I have this piece of simple code:
let val: u8 = 255 + 1;
println!("{}", val);
It is said here that such code will compile normally when run with the --release flag.
I am running this code via cargo run --release, and I still see the checks:
error: this arithmetic operation will overflow
 --> src/main.rs:2:19
  |
2 |     let val: u8 = 255 + 1;
  |                   ^^^^^^^ attempt to compute `u8::MAX + 1_u8`, which would overflow
  |
  = note: `#[deny(arithmetic_overflow)]` on by default

error: could not compile `rust-bin` due to previous error
Am I missing something?
The book is slightly imprecise. Overflow is disallowed in both debug and release modes; it's just that release mode omits the runtime checks for performance reasons (replacing them with wrapping, which is what the CPU does anyway). The compile-time checks are not removed because they don't affect the performance of the generated code. This prints 0 in release mode and panics in debug mode[1]:
let x: u8 = "255".parse().unwrap();
let val: u8 = x + 1;
println!("{}", val);
You can disable the compile-time checks using #[allow(arithmetic_overflow)]. This also prints 0 in release mode and panics in debug:
#[allow(arithmetic_overflow)]
let val: u8 = 255 + 1;
println!("{}", val);
The correct approach is to not depend on this behavior of release mode, but to tell the compiler what you want. This prints 0 in both debug and release mode:
let val: u8 = 255u8.wrapping_add(1);
println!("{}", val);
[1] The example uses "255".parse() because, to my surprise, let x = 255u8; let val = x + 1; doesn't compile; in other words, rustc doesn't just reject overflow in constant arithmetic, but wherever else it can prove it happens. The change was apparently made in Rust 1.45, because it compiled in Rust 1.44 and older. Since it broke code that previously compiled, the change was technically backward-incompatible, but presumably it broke sufficiently few actual crates that it was deemed worth it. Surprising as it is, it's quite possible that "255".parse::<u8>() + 1 will become a compile-time error in a later release.
In your code, the compiler is able to detect the problem. That's why it prevents it even in release mode. In many cases it's not possible or feasible for the compiler to detect or prevent an error.
Just as an example, imagine you have code like this:
let a = b + 5;
Let's say b's value comes from a database, user input, or some other external source. It is impossible for the compiler to detect an overflow like that at compile time.
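You can, however, still handle such an overflow explicitly at run time with the checked_* methods. A minimal sketch (the parse stands in for whatever external source b comes from):
fn main() {
    // `b` is not known at compile time, so the compiler cannot reject the addition.
    let b: u8 = "251".parse().unwrap();
    // checked_add returns None on overflow instead of panicking or wrapping.
    match b.checked_add(5) {
        Some(v) => println!("result: {}", v),
        None => println!("overflow!"),
    }
}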

Why does max_by_key with i32.log10() cause "attempt to add with overflow"?

I'm trying to get the max number of digits in an i32 array. I'm using log10(n) + 1 as my formula to calculate how many digits an i32 has and I thought I would just be able to use that within a max_by_key, but I am getting
thread 'main' panicked at 'attempt to add with overflow', /mnt/f/Personal-Docs/Repos/radix_sort_rs/src/lib/lib.rs:9:60
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Here's my code:
fn get_max_digits<const N: usize>(arr: [i32; N]) -> i32 {
    return *arr.iter().max_by_key(|a| a.log10() + 1).unwrap();
}
I'm still new to Rust, so sorry if this is a simple question.
I figured out the answer to my question not long after posting. I was trying to take the logarithm of 0, which is undefined. I believe the "attempt to add with overflow" is the current behavior of 0i32.log10(), which is still a nightly-only feature.
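One way around it is to avoid taking the logarithm of zero at all. A minimal sketch, assuming a toolchain where the integer log methods have been stabilized (checked_ilog10, the stable successor of the nightly log10 used above), and returning the digit count itself rather than the element:
fn get_max_digits<const N: usize>(arr: [i32; N]) -> u32 {
    arr.iter()
        // checked_ilog10 returns None for 0, so 0 is counted as one digit.
        .map(|&a| a.unsigned_abs().checked_ilog10().map_or(1, |d| d + 1))
        .max()
        .unwrap_or(0)
}

fn main() {
    println!("{}", get_max_digits([0, 7, 12345, -987])); // prints 5
}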

Getting the length of an int

I am trying to get the length (the number of digits when interpreted in decimal) of an int in Rust. I found a way to do it, but I am looking for a method that comes from the primitive itself. This is what I have:
let num = 90.to_string();
println!("num: {}", num.chars().count())
// num: 2
I am looking at https://docs.rs/digits/0.3.3/digits/struct.Digits.html#method.length. Is this a good candidate? How do I use it? Or are there other crates that do it for me?
A one-liner with less type conversion would be the ideal solution.
You could loop and check how often you can divide the number by 10 before it becomes a single digit (a sketch of that version appears after the function below).
Or in the other direction (because division is slower than multiplication), check how often you can multiply 10*10*...*10 until you reach the number:
fn length(n: u32, base: u32) -> u32 {
    let mut power = base;
    let mut count = 1;
    while n >= power {
        count += 1;
        if let Some(new_power) = power.checked_mul(base) {
            power = new_power;
        } else {
            break;
        }
    }
    count
}
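For comparison, the divide-by-10 loop mentioned first could look like this (a minimal sketch):
fn length_div(mut n: u32) -> u32 {
    let mut count = 1;
    // Divide by 10 until only a single digit remains, counting as we go.
    while n >= 10 {
        n /= 10;
        count += 1;
    }
    count
}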
With nightly Rust (or in the future, when the int_log feature is stabilized), you can use:
#![feature(int_log)]
n.checked_log10().unwrap_or(0) + 1
Here is a one-liner that doesn't require strings or floating point:
println!("num: {}", successors(Some(n), |&n| (n >= 10).then(|| n / 10)).count());
It simply counts the number of times the initial number needs to be divided by 10 in order to reach 0.
EDIT: the first version of this answer used iterate from the (excellent and highly recommended) itertools crate, but #trentcl pointed out that successors from the stdlib does the same. For reference, here is the version using iterate:
println!("num: {}", iterate(n, |&n| n / 10).take_while(|&n| n > 0).count().max(1));
Here's a (barely) one-liner that's faster than doing a string conversion, using std::iter stuff:
let some_int = 9834;
let decimal_places = (0..).take_while(|i| 10u64.pow(*i) <= some_int).count();
The first method below relies on the following formula, where a and b are the logarithmic bases.
log_a(x) = log_b(x) / log_b(a)
log_a(x) = log_2(x) / log_2(a)    // substituting 2 for b
The following function can be applied to finding the number of digits for bases that are a power of 2. This approach is very fast.
fn num_digits_base_pow2(n: u64, b: u32) -> u32 {
    (63 - n.leading_zeros()) / (31 - b.leading_zeros()) + 1
}
The bits are counted for both n (the number we want to represent) and b (the base) to find their floor log2 values. The ratio of these values, plus one, then gives the number of digits in the desired base.
For a general purpose approach to finding the number of digits for arbitrary bases, the following should suffice.
fn num_digits(n: u64, b: u32) -> u32
{
(n as f64).log(b as f64).ceil() as u32
}
If num is signed:
let digits = (num.abs() as f64 + 0.1).log10().ceil() as u32;
A nice property of numbers that is always good to have in mind is that the number of digits required to write a number $x$ in base $n$ is $\lceil \log_n(x + 1) \rceil$.
Therefore, one can simply write the following function (notice the cast from u32 to f32, since integers don't have a log function).
fn length(n: u32, base: u32) -> u32 {
    let n = (n + 1) as f32;
    n.log(base as f32).ceil() as u32
}
You can easily adapt it for negative numbers. For floating point numbers this might be a bit (i.e. a lot) more tricky.
To take into account Daniel's comment about the pathological cases introduced by using f32, note that with nightly Rust, integers have a logarithm method. (Note that, in my opinion, these are implementation details; you should focus more on understanding the algorithm than on the implementation.)
#![feature(int_log)]

fn length(n: u32, base: u32) -> u32 {
    n.log(base) + 1
}

Is there a way to `f64::from(0.23_f32)` and get 0.23_f64?

I'm trying to tie together two pieces of software: one that gives me an f32, and one that expects f64 values. In my code, I use f64::from(my_f32), but in my test, the value that I'm comparing has not been converted as expected: the f64 value has a bunch of extra, more precise digits, such that the values aren't equal.
In my case, the value is 0.23. Is there a way to convert the 0.23_f32 to f64 such that I end up with 0.23_f64 instead of 0.23000000417232513?
fn main() {
    let x = 0.23_f32;
    println!("{}", x);
    println!("{}", f64::from(x));
    println!("---");
    let x = 0.23_f64;
    println!("{}", x);
    println!("{}", f64::from(x));
}
Playground
Edit: I understand that floating-point numbers are stored differently--in fact, I use this handy visualizer on occasion to view the differences in representations between 32-bit and 64-bit floats. I was looking to see if there's some clever way to get around this.
Edit 2: A "clever" example that I just conjured up would be my_32.to_string().parse::<f64>()--that gets me 0.23_f64, but (obviously) requires string parsing. I'd like to think there might be something at least slightly more numbers-related (for lack of a better term).
Comments have already pointed out why this is happening. This answer exists to give you ways to circumvent this.
The first (and most obvious) is to use arbitrary-precision libraries. A solid example of this in rust is rug. This allows you to express pretty much any number exactly, but it causes some problems across FFI boundaries (amongst other cases).
The second is to do what most people do around floating point numbers, and bracket your equalities. Since you know that most floats will not be stored exactly, and you know your input type, you can use constants such as std::f32::MIN to bracket your type, like so (playground):
use std::cmp::PartialOrd;
use std::ops::{Add, Div, Sub};

fn bracketed_eq<
    I,
    E: From<I> + From<f32> + Clone + PartialOrd + Div<Output = E> + Sub<Output = E> + Add<Output = E>,
>(
    input: E,
    target: I,
    value: I,
) -> bool {
    let target: E = target.into();
    let value: E = value.into();
    let bracket_lhs: E = target.clone() - (value.clone() / (2.0).into());
    let bracket_rhs: E = target.clone() + (value.clone() / (2.0).into());
    bracket_lhs >= input && bracket_rhs <= input
}

#[test]
fn test() {
    let u: f32 = 0.23_f32;
    assert!(bracketed_eq(f64::from(u), 0.23, std::f32::MIN))
}
A large amount of this is boilerplate and a lot of it gets completely optimized away by the compiler; it is also possible to drop the Clone requirement by restricting some trait choices. Add, Sub, Div are there for the operations, From<I> to realize the conversion, From<f32> for the constant 2.0.
The right way to compare floating-point values is to bracket them. The question is how to determine the bracketing interval? In your case, since you have a representation of the target value as f32, you have two solutions:
The obvious solution is to do the comparison between f32s, so convert your f64 result to f32 to get rid of the extra digits, and compare that to the expected result. Of course, this may still fail if accumulated rounding errors cause the result to be slightly different.
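For that first option, the comparison can be as simple as casting the result back down before comparing. A minimal sketch using the values from the question:
let result: f64 = f64::from(0.23_f32); // what the code under test produces
let expect: f32 = 0.23;
// Compare at f32 precision: rounding the result back to f32 restores the original value.
assert_eq!(result as f32, expect);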
The right solution would have been to use the next_after function to get the smallest bracketing interval around your target:
let result: f64 = 0.23f64;
let expect: f32 = 0.23;
assert_ne!(result, expect.into());
assert!(expect.next_after (0.0).into() < result && result < expect.next_after (1.0).into());
but unfortunately this was never stabilized (see #27752).
So you will have to determine the precision that is acceptable to you, possibly as a function of f32::EPSILON:
let result: f64 = 0.23f64;
let expect: f32 = 0.23;
assert_ne!(result, expect.into());
assert!(f64::from(expect) - f64::from(std::f32::EPSILON) < result && result < f64::from(expect) + f64::from(std::f32::EPSILON));
If you don't want to compare the value, but instead want to truncate it before passing it on to some computation, then the function to use is f64::round:
const PRECISION: f64 = 100.0;
let from_db: f32 = 0.23;
let truncated = (f64::from (from_db) * PRECISION).round() / PRECISION;
println!("f32 : {:.32}", from_db);
println!("f64 : {:.32}", 0.23f64);
println!("output: {:.32}", truncated);
prints:
f32 : 0.23000000417232513427734375000000
f64 : 0.23000000000000000999200722162641
output: 0.23000000000000000999200722162641
A couple of notes:
The result is still not equal to 0.23 since that number cannot be represented as an f64 (or as an f32 for that matter), but it is as close as you can get.
If there are legal implications as you implied, then you probably shouldn't be using floating point numbers in the first place but you should use either some kind of fixed-point with the legally mandated precision, or some arbitrary precision library.
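As an illustration of the fixed-point idea, here is a minimal sketch assuming two decimal places are what is mandated (the Fixed2 type is made up for this example):
// Store the value as an integer number of hundredths, so 0.23 is exactly 23.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Fixed2(i64);

fn main() {
    let from_db = Fixed2(23); // 0.23, represented exactly
    let expected = Fixed2(23);
    assert_eq!(from_db, expected); // exact comparison, no rounding error
    println!("{}.{:02}", from_db.0 / 100, from_db.0 % 100); // prints 0.23
}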

Assign return value when index check out of bounds

In pseudo-code, I'm trying the following:
for i in len(array):
    try:
        a = array[i-1]
    except(out_of_bounds_error):
        a = false
where array is just made up of booleans.
In the book (Chapter 9.2) it says you can check whether a function returns a result or not with something like:
let a: u32 = array[i-1]
which tells me a is indeed a bool. Without a Result type, how do I handle the inevitable (and expected) attempt to subtract with overflow error at runtime?
The error attempt to subtract with overflow occurs when computing i - 1 when i == 0. Array indices must be of type usize, which is an unsigned type, and unsigned types cannot represent negative numbers, which 0 - 1 would produce. In a debug build, the compiler generates code that raises this error, but in a release build, the compiler generates code that will simply compute the "wrong" value (in this case, this happens to be usize::max_value()).
You can avoid this error in both debug builds and release builds by performing a checked subtraction instead. checked_sub returns an Option: you'll get a Some if the subtraction succeeded or None if it failed. You can then use map_or on that Option to read the array only if the subtraction produced a valid index.
fn main() {
    let a = vec![true; 10];
    for i in 0..a.len() {
        let b = i.checked_sub(1).map_or(false, |j| a[j]);
        println!("b: {}", b);
    }
}
Arrays (or rather, slices) also have a get method that returns an Option, yielding None if the index is out of bounds instead of panicking. If we were adding one to the index instead of subtracting one, we could do this:
fn main() {
    let a = vec![true; 10];
    for i in 0..a.len() {
        let b = i.checked_add(1).and_then(|j| a.get(j).cloned()).unwrap_or(false);
        println!("b: {}", b);
    }
}
This time, we're using and_then to chain an operation that produces an Option with another operation that also produces an Option, and we're using unwrap_or to get the Option's value or a default value if it's None.

Resources