I have an ASCII string slice and I need to compute the sum of all characters when seen as bytes.
let word = "Hello, World";
let sum = word.as_bytes().iter().sum::<u8>();
I need to specify the type for sum, otherwise Rust will not compile. The problem is that u8 is too small a type, and if the sum overflows the program will panic.
I'd like to avoid that, but I cannot find a way to specify a bigger type such as u16 or u32 for example, when using sum().
I may try to use fold(), but I was wondering if there is a way to use sum() by specifying another type.
let sum = word.as_bytes().iter().fold(0u32, |acc, x| acc + *x as u32);
You can use map to cast each byte to a bigger type:
let sum: u32 = word.as_bytes().iter().map(|&b| b as u32).sum();
or
let sum: u32 = word.as_bytes().iter().cloned().map(u32::from).sum();
The reason you can't sum into a u32 with your original attempt is that the Sum trait, which provides the sum method, has the following definition:
pub trait Sum<A = Self> {
    fn sum<I>(iter: I) -> Self
    where
        I: Iterator<Item = A>;
}
This means that, by default, its sum method returns the same type as the items of the iterator it is called on. You can see that this is the case for u8 by looking at its implementation of Sum:
fn sum<I>(iter: I) -> u8
where
    I: Iterator<Item = u8>,
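For illustration, here is a minimal sketch of what does and does not type-check as a consequence (the variable names are just examples):

let bytes = "Hello, World".as_bytes();

// Compiles: after the map the items are u32, and `impl Sum<u32> for u32` applies.
let sum: u32 = bytes.iter().map(|&b| u32::from(b)).sum();

// Does not compile: there is no `impl Sum<&u8> for u32`.
// let sum: u32 = bytes.iter().sum();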
I wrote a function that accepts a slice of single digit numbers and returns a number.
use num::{FromPrimitive, Num}; // from the `num` crate
use std::ops::{Add, Mul};

pub fn from_digits<T>(digits: &[T]) -> T
where
    T: Num + FromPrimitive + Add + Mul + Copy,
{
    let mut ret: T = T::zero();
    let ten: T = T::from_i8(10).unwrap();
    for d in digits {
        ret = ret * ten + *d;
    }
    ret
}
For example, from_digits(&vec![1,2,3,4,5]) returns 12345. This seems to work fine.
Now, I want to use this function in another code:
let ret: Vec<Vec<i64>> = digits // &[i64]
    .iter()                     // Iter<i64>
    .rev()                      // impl Iterator<Item = i64>
    .permutations(len)          // Permutations<Rev<Iter<i64>>>
    .map(|ds| from_digits(&ds)) // <-- ds: Vec<&i64>
    .collect();
The problem is that after permutations(), the type in the lambda in map is Vec<&i64>, not Vec<i64>. This causes a compile error, since the expected parameter type is &[T], not &[&T].
I don't understand why the type of ds became Vec<&i64>. I tried to change the from_digits like this:
pub fn from_digits<T>(digits: &[&T]) -> T
...
But I am not sure this is the correct way to fix the issue. It also causes another problem: I can no longer pass simple data like vec![1,2,3] to the function.
Can you let me know the correct way to fix this?
The problem is that slice's iter() function returns an iterator over &T, so &i64.
The fix is to use the Iterator::copied() or Iterator::cloned() adapter, which converts an iterator over &T into an iterator over T when T: Copy or T: Clone, respectively:
digits.iter().copied().rev() // ...
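For completeness, a minimal sketch of the fixed pipeline, assuming the itertools crate and the from_digits function from the question; note that the collected result is then a Vec<i64>, not a Vec<Vec<i64>>:

use itertools::Itertools;

let digits: &[i64] = &[1, 2, 3];
let len = digits.len();
let ret: Vec<i64> = digits
    .iter()
    .copied()          // now an iterator over i64 instead of &i64
    .rev()
    .permutations(len) // each permutation is a Vec<i64>
    .map(|ds| from_digits(&ds))
    .collect();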
I just started to learn Rust. I understand that Rust's for loop indices and vector indices must be of type usize, hence I have written the following code. The computation of j within the for loop requires i to be type u32, so I convert it. Now, I have to change the type of i and j again to get the vector items.
I would like to avoid this constant back-and-forth conversion, is there an alternate way to do this in Rust? Thank you for your help.
fn compute(dots: Vec, N: u32) -> f32 {
    let mut j: u32;
    let mut value: f32 = 0.0;
    for i in 0..N {
        j = (i as u32 + 1) % N;
        value += dots[i as usize].a * dots[j as usize].b;
        value -= dots[i as usize].b * dots[j as usize].a;
    }
    return value
}
Either change the function signature to use N: usize, or, if you can't do that, just let M = N as usize and loop over 0..M (the loop variable will then have type usize).
Be aware that in real code, you need to be sure that usize is at least as wide as u32 if you opt for the conversion. If you cannot assure that, use try_into instead of as to convert.
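For example, here is a sketch of the usize-based version; the element type of dots was not shown in the question, so a Dot struct with a and b fields is assumed, and the parameter is renamed n to follow Rust naming conventions:

struct Dot { a: f32, b: f32 }

fn compute(dots: Vec<Dot>, n: usize) -> f32 {
    let mut value: f32 = 0.0;
    for i in 0..n {
        let j = (i + 1) % n; // j is a usize, no conversion needed
        value += dots[i].a * dots[j].b;
        value -= dots[i].b * dots[j].a;
    }
    value
}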
I modified code found on the internet to create a function that obtains the statistical mode of any Hashable type that implements Eq, but I do not understand some of the syntax. Here is the function:
use std::hash::Hash;
use std::collections::HashMap;
pub fn mode<'a, I, T>(items: I) -> &'a T
where
    I: IntoIterator<Item = &'a T>,
    T: Hash + Clone + Eq,
{
    let mut occurrences: HashMap<&T, usize> = HashMap::new();
    for value in items.into_iter() {
        *occurrences.entry(value).or_insert(0) += 1;
    }
    occurrences
        .into_iter()
        .max_by_key(|&(_, count)| count)
        .map(|(val, _)| val)
        .expect("Cannot compute the mode of zero items")
}
(I think requiring Clone may be overkill.)
The syntax I do not understand is in the closure passed to max_by_key:
|&(_, count)| count
What is the &(_, count) doing? I gather the underscore means I can ignore that parameter. Is this some sort of destructuring of a tuple in a parameter list? Does this make count take the reference of the tuple's second item?
.max_by_key(|&(_, count)| count) is equivalent to .max_by_key(f) where f is this:
fn f<T>(t: &(T, usize)) -> usize {
    (*t).1
}
f() could also be written using pattern matching, like this:
fn f2<T>(&(_, count): &(T, usize)) -> usize {
    count
}
And f2() is much closer to the first closure you're asking about.
The second closure, the one passed to map, is essentially the same, except that there is no reference pattern to slightly complicate matters.
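To make the binding concrete, here is a small sketch with placeholder values, shaped like the (key, count) pairs above:

let pair: &(&str, usize) = &("a", 3);

// The `&` in the pattern matches the reference itself, so `count` binds to a
// plain usize copied out of the tuple, not to a reference to it.
let &(_, count) = pair;
assert_eq!(count, 3);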
I tried to implement a small module where I calculate the mean of a vector:
pub mod vector_calculations {
    pub fn mean(vec: &Vec<i32>) -> f32 {
        let mut sum: f32 = 0.0;
        for el in vec.iter() {
            sum = el + sum;
        }
        sum / vec.len()
    }
}
As far as I can tell from the compiler error, there are two problems with my code:
error[E0277]: the trait bound `&i32: std::ops::Add<f32>` is not satisfied
--> src/main.rs:6:22
|
6 | sum = el + sum;
| ^ no implementation for `&i32 + f32`
|
= help: the trait `std::ops::Add<f32>` is not implemented for `&i32`
error[E0277]: the trait bound `f32: std::ops::Div<usize>` is not satisfied
--> src/main.rs:9:13
|
9 | sum / vec.len()
| ^ no implementation for `f32 / usize`
|
= help: the trait `std::ops::Div<usize>` is not implemented for `f32`
I'm trying to add an &i32 to an f32, and to divide an f32 by a usize.
I could solve the second error by changing the last line to:
sum / (vec.len() as f32)
Is this actually how a Rust programmer would do this?
Furthermore, I don't really know how to solve the first error. What has to be done and why?
Yes, dereferencing values and converting numeric types is normal in Rust. These conversions help the programmer recognize that edge cases are possible. As loganfsmyth points out:
An i32 can hold values greater than f32 can represent accurately
Unfortunately, the compiler can't tell what's "correct" for your case, so you still have to be on guard.
For what it's worth, I'd write your current implementation using Iterator::sum:
fn mean(items: &[i32]) -> f64 {
let sum: f64 = items.iter().map(|&v| v as f64).sum();
sum / (items.len() as f64)
}
You should also probably handle the case where the input is empty to avoid dividing by zero:
fn mean(items: &[i32]) -> Option<f64> {
    let len = items.len();
    if len == 0 {
        None
    } else {
        let sum: f64 = items.iter().map(|&v| v as f64).sum();
        Some(sum / (len as f64))
    }
}
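A quick usage sketch of that Option-returning version (the numbers are arbitrary):

assert_eq!(mean(&[1, 2, 3]), Some(2.0));
assert_eq!(mean(&[]), None);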
Using the method from What is a good solution for calculating an average where the sum of all values exceeds a double's limits?, but made a bit more iterator-heavy:
fn mean2(ary: &[i32]) -> f64 {
    ary.iter().enumerate().fold(0.0, |avg, (i, &x)| {
        avg + ((x as f64 - avg) / (i + 1) as f64)
    })
}
See also:
Why is it discouraged to accept a reference to a String (&String) or Vec (&Vec) as a function argument?
.iter() returns an &i32 and Rust does not automatically dereference for type conversions — you are currently trying to change the pointer (&) instead of changing what it's pointing to.
Changing your code to look like this is the simplest way to make it work:
pub mod vector_calculations {
    pub fn mean(vec: &Vec<i32>) -> f32 {
        let mut sum: f32 = 0.0;
        for el in vec.iter() {
            sum = *el as f32 + sum; // first dereference the pointer, then cast to f32
        }
        sum / vec.len() as f32 // cast to f32
    }
}
But there are some ways to improve this kind of code:
pub mod vector_calculations {
    // Accept a slice instead of a vector: it now allows arrays, slices, and
    // vectors, but you can't add or remove items during this function call.
    pub fn mean(vec: &[i32]) -> f32 {
        // As the sum is still a whole number, changing the type should make it
        // slightly easier to understand.
        let mut sum: i32 = 0;
        for el in vec.iter() {
            // This now works without changing the type of el; you don't even
            // need to dereference el anymore, as Rust does it automatically.
            sum = el + sum;
        }
        sum as f32 / vec.len() as f32 // now you need to cast to f32 twice at the end
    }
}
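And a brief usage sketch showing that the slice version accepts vectors and arrays alike (the numbers are just examples):

fn main() {
    let v = vec![1, 2, 3, 4];
    let a = [5, 6, 7];
    println!("{}", vector_calculations::mean(&v)); // &Vec<i32> coerces to &[i32]
    println!("{}", vector_calculations::mean(&a)); // &[i32; 3] coerces to &[i32]
}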
I found a function to compute a mean and have been playing with it. The code snippet below runs, but if the data inside the input changes from a float to an int an error occurs. How do I get this to work with floats and integers?
use std::borrow::Borrow;

fn mean(arr: &mut [f64]) -> f64 {
    let mut i = 0.0;
    let mut mean = 0.0;
    for num in arr {
        i += 1.0;
        mean += (num.borrow() - mean) / i;
    }
    mean
}

fn main() {
    let val = mean(&mut vec![4.0, 5.0, 3.0, 2.0]);
    println!("The mean is {}", val);
}
The code in the question doesn't compile because f64 does not have a borrow() method. Also, the slice it accepts doesn't need to be mutable since we are not changing it. Here is a modified version that compiles and works:
fn mean(arr: &[f64]) -> f64 {
    let mut i = 0.0;
    let mut mean = 0.0;
    for &num in arr {
        i += 1.0;
        mean += (num - mean) / i;
    }
    mean
}
We specify &num when looping over arr so that the type of num is f64 rather than a reference to f64. This particular snippet would work either way, but omitting it would break the generic version below.
For the same function to accept floats and integers alike, its parameter needs to be generic. Ideally we'd like it to accept any type that can be converted into f64, including f32 or user-defined types that define such a conversion. Something like this:
fn mean<T>(arr: &[T]) -> f64 {
    let mut i = 0.0;
    let mut mean = 0.0;
    for &num in arr {
        i += 1.0;
        mean += (num as f64 - mean) / i;
    }
    mean
}
This doesn't compile because x as f64 is not defined for x of an arbitrary type. Instead, we need a trait bound on T that defines a way to convert T values to f64. This is exactly the purpose of the Into trait; every type T that implements Into<U> defines an into(self) -> U method. Specifying T: Into<f64> as the trait bound gives us the into() method that returns an f64.
We also need to require T to be Copy, so that reading a value from the array does not "consume" it, i.e. attempt to move it out of the array. Since primitive numbers such as integers implement Copy, this is not a problem for us. Working code then looks like this:
fn mean<T: Into<f64> + Copy>(arr: &[T]) -> f64 {
    let mut i = 0.0;
    let mut mean = 0.0;
    for &num in arr {
        i += 1.0;
        mean += (num.into() - mean) / i;
    }
    mean
}

fn main() {
    let val1 = mean(&vec![4.0, 5.0, 3.0, 2.0]);
    let val2 = mean(&vec![4, 5, 3, 2]);
    println!("The means are {} and {}", val1, val2);
}
Note that this will only work for types that define lossless conversion to f64. Thus it will work for u32, i32 (as in the above example) and smaller integer types, but it won't accept for example a vector of i64 or u64, which cannot be losslessly converted to f64.
Also note that this problem lends itself nicely to functional programming idioms such as enumerate() and fold(). Although it is outside the scope of this already longish answer, writing out such an implementation is an exercise hard to resist.
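For the curious, here is a minimal sketch of what such a fold-based variant could look like, keeping the same T: Into<f64> + Copy bound (the name mean_fold is just for illustration):

fn mean_fold<T: Into<f64> + Copy>(arr: &[T]) -> f64 {
    arr.iter().enumerate().fold(0.0, |avg, (i, &x)| {
        // Incrementally update the running mean, as in the loop version above.
        avg + (x.into() - avg) / (i + 1) as f64
    })
}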