How to do a binary search on a Vec of floats? - rust

If you have a Vec<u32> you would use the slice::binary_search method.
For reasons I don't understand, f32 and f64 do not implement Ord. Since the primitive types are from the standard library, you cannot implement Ord on them yourself, so it does not appear you can use this method.
How can you effectively do this?
Do I really have to wrap f64 in a wrapper struct and implement Ord on it? It seems extremely painful to have to do this, and involves a great deal of transmute to cast blocks of data back and forth unsafely for effectively no reason.

for reasons I don't understand, f32 and f64 do not implement Ord.
Because floating point is hard! The short version is that floating point numbers have a special value NaN - Not a Number. The IEEE spec for floating point numbers states that 1 < NaN, 1 > NaN, and NaN == NaN are all false.
Ord says:
Trait for types that form a total order.
This means that the comparisons need to have totality:
a ≤ b or b ≤ a
but we just saw that floating points do not have this property.
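A minimal sketch of the missing totality, using only the standard library. The helper name `is_comparable` is made up for illustration and is not part of any API:

```rust
use std::cmp::Ordering;

/// Hypothetical helper: true when `a` and `b` can be ordered relative to each other.
fn is_comparable(a: f64, b: f64) -> bool {
    a.partial_cmp(&b).is_some()
}

fn main() {
    let nan = f64::NAN;
    // Every comparison involving NaN is false, so neither a <= b nor b <= a holds.
    assert!(!(1.0 < nan));
    assert!(!(1.0 > nan));
    assert!(!(nan == nan));
    // PartialOrd reports the missing ordering as None instead of an Ordering.
    assert_eq!(1.0_f64.partial_cmp(&nan), None);
    assert_eq!(1.0_f64.partial_cmp(&2.0), Some(Ordering::Less));
    assert!(!is_comparable(nan, nan));
}
```

This is why `partial_cmp` returns an `Option<Ordering>`, while `Ord::cmp` must always produce an `Ordering`.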
So yes, you will need to create a wrapper type that somehow deals with comparing the large number of NaN values. Maybe in your case you can just assert that the float value is never NaN and then call out to the regular PartialOrd trait. Here's an example:
use std::cmp::Ordering;

#[derive(PartialEq, PartialOrd)]
struct NonNan(f64);

impl NonNan {
    fn new(val: f64) -> Option<NonNan> {
        if val.is_nan() {
            None
        } else {
            Some(NonNan(val))
        }
    }
}

impl Eq for NonNan {}

impl Ord for NonNan {
    fn cmp(&self, other: &NonNan) -> Ordering {
        self.partial_cmp(other).unwrap()
    }
}

fn main() {
    let mut v: Vec<_> = [2.0, 1.0, 3.0].iter().map(|v| NonNan::new(*v).unwrap()).collect();
    v.sort();
    let r = v.binary_search(&NonNan::new(2.0).unwrap());
    println!("{:?}", r);
}

One of the slice methods is binary_search_by, which you could use. f32/f64 implement PartialOrd, so if you know they can never be NaN, you can unwrap the result of partial_cmp:
fn main() {
    let values = [1.0, 2.0, 3.0, 4.0, 5.0];
    let location = values.binary_search_by(|v| {
        v.partial_cmp(&3.14).expect("Couldn't compare values")
    });
    match location {
        Ok(i) => println!("Found at {}", i),
        Err(i) => println!("Not found, could be inserted at {}", i),
    }
}
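The Err index reported above is the position where the value could be inserted while keeping the slice sorted. A small sketch of that use follows; the helper `insert_sorted` is a hypothetical name, and it assumes the data never contains NaN:

```rust
/// Hypothetical helper: insert `x` into an already sorted Vec, keeping it sorted.
/// Panics if `x` or a probed element is NaN, because partial_cmp then returns None.
fn insert_sorted(values: &mut Vec<f64>, x: f64) -> usize {
    let index = values
        .binary_search_by(|v| v.partial_cmp(&x).expect("NaN in comparison"))
        // Ok(i): an equal element exists at i. Err(i): i is the insertion point.
        .unwrap_or_else(|i| i);
    values.insert(index, x);
    index
}

fn main() {
    let mut values = vec![1.0, 2.0, 3.0, 4.0, 5.0];
    let index = insert_sorted(&mut values, 3.14);
    println!("Inserted at {}: {:?}", index, values);
}
```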

A built-in total-ordering comparison method for floats named .total_cmp() is now stable, as of Rust 1.62.0. This implements the total ordering defined in IEEE 754, with every possible f64 bit value being sorted distinctly, including positive and negative zero, and all of the possible NaNs.
Floats still won't implement Ord, so they won't be directly sortable, but the boilerplate has been cut down to a single line, without any external imports or chance of panicking:
fn main() {
    let mut v: Vec<f64> = vec![2.0, 2.5, -0.5, 1.0, 1.5];
    v.sort_by(f64::total_cmp);
    let target = 1.25;
    let result = v.binary_search_by(|probe| probe.total_cmp(&target));
    match result {
        Ok(index) => {
            println!("Found target {target} at index {index}.");
        }
        Err(index) => {
            println!("Did not find target {target} (expected index was {index}).");
        }
    }
}
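To illustrate how total_cmp treats the special values, here is a short sketch. The helper `sorted_total` is a hypothetical name, and the NaN is forced positive with copysign because the sign of the f64::NAN constant is not something to rely on:

```rust
/// Hypothetical helper: sort a copy of `values` using the IEEE 754 total order.
fn sorted_total(values: &[f64]) -> Vec<f64> {
    let mut v = values.to_vec();
    v.sort_by(f64::total_cmp);
    v
}

fn main() {
    // A positive NaN sorts after +infinity under total_cmp.
    let nan = f64::NAN.copysign(1.0);
    let v = sorted_total(&[0.0, nan, -0.0, f64::INFINITY, -1.0]);
    // Negative zero is ordered before positive zero.
    assert!(v[1] == 0.0 && v[1].is_sign_negative());
    assert!(v[2] == 0.0 && v[2].is_sign_positive());
    assert!(v[4].is_nan());
    // The search cannot panic, even with NaN present in the data.
    let found = v.binary_search_by(|probe| probe.total_cmp(&f64::INFINITY));
    assert_eq!(found, Ok(3));
    println!("{:?}", v);
}
```

Note that this ordering distinguishes -0.0 from 0.0, so a search for 0.0 will not report a match on an element that is -0.0.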

If you are sure that your floating point values will never be NaN, you can express this semantic with the wrappers in decorum. Specifically, the type Ordered implements Ord and panics whenever the program tries to do something invalid:
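use decorum::Ordered;

fn foo() {
    // The turbofish is required when naming the type parameter in an expression.
    let ordered = Ordered::<f32>::from_inner(10.0);
    // Convert back to the primitive; into() needs the target type stated.
    let normal: f32 = ordered.into();
}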

https://github.com/emerentius/ord_subset implements an ord_subset_binary_search() method that you can use for this.
From their README:
let mut s = [5.0, std::f64::NAN, 3.0, 2.0];
s.ord_subset_sort();
assert_eq!(&s[0..3], &[2.0, 3.0, 5.0]);
assert_eq!(s.ord_subset_binary_search(&5.0), Ok(2));
assert_eq!(s.iter().ord_subset_max(), Some(&5.0));
assert_eq!(s.iter().ord_subset_min(), Some(&2.0));

Related

How to calculate u64 modulus u8 in Rust? [duplicate]

Editor's note: This question is from a version of Rust prior to 1.0 and references some items that are not present in Rust 1.0. The answers still contain valuable information.
What's the idiomatic way to convert from (say) a usize to a u32?
For example, casting using 4294967295us as u32 works and the Rust 0.12 reference docs on type casting say
A numeric value can be cast to any numeric type. A raw pointer value can be cast to or from any integral type or raw pointer type. Any other cast is unsupported and will fail to compile.
but 4294967296us as u32 will silently overflow and give a result of 0.
I found ToPrimitive and FromPrimitive which provide nice functions like to_u32() -> Option<u32>, but they're marked as unstable:
#[unstable(feature = "core", reason = "trait is likely to be removed")]
What's the idiomatic (and safe) way to convert between numeric (and pointer) types?
The platform-dependent size of isize / usize is one reason why I'm asking this question - the original scenario was that I wanted to convert from u32 to usize so I could represent a tree in a Vec<u32> (e.g. let t = vec![0u32, 0u32, 1u32], and the grandparent of node 2 would then be t[t[2us] as usize]), and I wondered how it would fail if usize was less than 32 bits.
Converting values
From a type that fits completely within another
There's no problem here. Use the From trait to be explicit that there's no loss occurring:
fn example(v: i8) -> i32 {
i32::from(v) // or v.into()
}
You could choose to use as, but it's recommended to avoid it when you don't need it (see below):
fn example(v: i8) -> i32 {
v as i32
}
From a type that doesn't fit completely in another
There isn't a single method that makes general sense - you are asking how to fit two things in a space meant for one. One good initial attempt is to use an Option — Some when the value fits and None otherwise. You can then fail your program or substitute a default value, depending on your needs.
Since Rust 1.34, you can use TryFrom:
use std::convert::TryFrom;
fn example(v: i32) -> Option<i8> {
i8::try_from(v).ok()
}
Before that, you'd have to write similar code yourself:
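fn example(v: i32) -> Option<i8> {
    // Check both ends of the target range; `as` would silently wrap otherwise.
    if v > std::i8::MAX as i32 || v < std::i8::MIN as i32 {
        None
    } else {
        Some(v as i8)
    }
}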
From a type that may or may not fit completely within another
The range of numbers isize / usize can represent changes based on the platform you are compiling for. You'll need to use TryFrom regardless of your current platform.
See also:
How do I convert a usize to a u32 using TryFrom?
Why is type conversion from u64 to usize allowed using `as` but not `From`?
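As a sketch of how this applies to the tree-of-indices scenario from the question, the following uses TryFrom for the u32 to usize step. The helper `parent` is a hypothetical name introduced only for illustration:

```rust
use std::convert::TryFrom;

/// Hypothetical helper: look up the parent of `node` in a tree stored as parent indices.
fn parent(tree: &[u32], node: u32) -> Option<u32> {
    // u32 -> usize can fail on targets where usize is narrower than 32 bits.
    let index = usize::try_from(node).ok()?;
    // get() returns None for an out-of-range index instead of panicking.
    tree.get(index).copied()
}

fn main() {
    let tree = vec![0u32, 0, 1];
    // Grandparent of node 2: parent(parent(2)).
    let grandparent = parent(&tree, 2).and_then(|p| parent(&tree, p));
    assert_eq!(grandparent, Some(0));
    // The usize -> smaller integer direction is checked the same way.
    assert_eq!(u32::try_from(7usize), Ok(7u32));
    assert!(u8::try_from(300usize).is_err());
}
```

Returning an Option keeps the failure visible to the caller instead of silently truncating, which is what `as` would do.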
What as does
but 4294967296us as u32 will silently overflow and give a result of 0
When converting to a smaller type, as just takes the lower bits of the number, disregarding the upper bits, including the sign:
fn main() {
let a: u16 = 0x1234;
let b: u8 = a as u8;
println!("0x{:04x}, 0x{:02x}", a, b); // 0x1234, 0x34
let a: i16 = -257;
let b: u8 = a as u8;
println!("0x{:02x}, 0x{:02x}", a, b); // 0xfeff, 0xff
}
See also:
What is the difference between From::from and as in Rust?
About ToPrimitive / FromPrimitive
RFC 369, Num Reform, states:
Ideally [...] ToPrimitive [...] would all be removed in favor of a more principled way of working with C-like enums
In the meantime, these traits live on in the num crate:
ToPrimitive
FromPrimitive

How to allow function to work with integers or floats?

I found a function to compute a mean and have been playing with it. The code snippet below runs, but if the data inside the input changes from a float to an int an error occurs. How do I get this to work with floats and integers?
use std::borrow::Borrow;
fn mean(arr: &mut [f64]) -> f64 {
let mut i = 0.0;
let mut mean = 0.0;
for num in arr {
i += 1.0;
mean += (num.borrow() - mean) / i;
}
mean
}
fn main() {
let val = mean(&mut vec![4.0, 5.0, 3.0, 2.0]);
println!("The mean is {}", val);
}
The code in the question doesn't compile because f64 does not have a borrow() method. Also, the slice it accepts doesn't need to be mutable since we are not changing it. Here is a modified version that compiles and works:
fn mean(arr: &[f64]) -> f64 {
let mut i = 0.0;
let mut mean = 0.0;
for &num in arr {
i += 1.0;
mean += (num - mean) / i;
}
mean
}
We specify &num when looping over arr, so that the type of num is f64 rather than a reference to f64. This snippet would work with both, but omitting it would break the generic version.
For the same function to accept floats and integers alike, its parameter needs to be generic. Ideally we'd like it to accept any type that can be converted into f64, including f32 or user-defined types that define such a conversion. Something like this:
fn mean<T>(arr: &[T]) -> f64 {
let mut i = 0.0;
let mut mean = 0.0;
for &num in arr {
i += 1.0;
mean += (num as f64 - mean) / i;
}
mean
}
This doesn't compile because x as f64 is not defined for x of an arbitrary type. Instead, we need a trait bound on T that defines a way to convert T values to f64. This is exactly the purpose of the Into trait; every type T that implements Into<U> defines an into(self) -> U method. Specifying T: Into<f64> as the trait bound gives us the into() method that returns an f64.
We also need to request T to be Copy, to prevent reading the value from the array to "consume" the value, i.e. attempt moving it out of the array. Since primitive numbers such as integers implement Copy, this is ok for us. Working code then looks like this:
fn mean<T: Into<f64> + Copy>(arr: &[T]) -> f64 {
let mut i = 0.0;
let mut mean = 0.0;
for &num in arr {
i += 1.0;
mean += (num.into() - mean) / i;
}
mean
}
fn main() {
let val1 = mean(&vec![4.0, 5.0, 3.0, 2.0]);
let val2 = mean(&vec![4, 5, 3, 2]);
println!("The means are {} and {}", val1, val2);
}
Note that this will only work for types that define lossless conversion to f64. Thus it will work for u32, i32 (as in the above example) and smaller integer types, but it won't accept for example a vector of i64 or u64, which cannot be losslessly converted to f64.
Also note that this problem lends itself nicely to functional programming idioms such as enumerate() and fold(). Although outside the scope of this already longish answer, writing out such an implementation is an exercise hard to resist.
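One possible way to write it with enumerate() and fold() is sketched below. It keeps the same trait bounds as the loop version and is only an illustration, not necessarily the formulation the answer had in mind:

```rust
/// Running mean expressed with enumerate() and fold().
/// Returns 0.0 for an empty slice, like the loop version.
fn mean<T: Into<f64> + Copy>(arr: &[T]) -> f64 {
    arr.iter().enumerate().fold(0.0_f64, |mean, (i, &num)| {
        let x: f64 = num.into();
        // i is zero-based, so the element count so far is i + 1.
        mean + (x - mean) / (i as f64 + 1.0)
    })
}

fn main() {
    let val1 = mean(&[4.0, 5.0, 3.0, 2.0]);
    let val2 = mean(&[4, 5, 3, 2]);
    println!("The means are {} and {}", val1, val2);
}
```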

How do I compare a vector against a reversed version of itself?

Why won't this compile?
fn isPalindrome<T>(v: Vec<T>) -> bool {
return v.reverse() == v;
}
I get
error[E0308]: mismatched types
--> src/main.rs:2:25
|
2 | return v.reverse() == v;
| ^ expected (), found struct `std::vec::Vec`
|
= note: expected type `()`
found type `std::vec::Vec<T>`
Since you only need to look at the front half and back half, you can use the DoubleEndedIterator trait (methods .next() and .next_back()) to look at pairs of front and back elements this way:
/// Determine if an iterable equals itself reversed
fn is_palindrome<I>(iterable: I) -> bool
where
I: IntoIterator,
I::Item: PartialEq,
I::IntoIter: DoubleEndedIterator,
{
let mut iter = iterable.into_iter();
while let (Some(front), Some(back)) = (iter.next(), iter.next_back()) {
if front != back {
return false;
}
}
true
}
This version is a bit more general, since it supports any iterable that is double ended, for example slice and chars iterators.
It only examines each element once, and it automatically skips the remaining middle element if the iterator was of odd length.
Read up on the documentation for the function you are using:
Reverse the order of elements in a slice, in place.
Or check the function signature:
fn reverse(&mut self)
The return value of the method is the unit type, an empty tuple (). You can't compare that against a vector.
Stylistically, Rust uses 4 space indents, snake_case identifiers for functions and variables, and has an implicit return at the end of blocks. You should adjust to these conventions in a new language.
Additionally, you should take a &[T] instead of a Vec<T> if you are not adding items to the vector.
To solve your problem, we will use iterators to compare the slice. You can get forward and backward iterators of a slice, which requires a very small amount of space compared to reversing the entire array. Iterator::eq allows you to do the comparison succinctly.
You also need to state that the T is comparable against itself, which requires Eq or PartialEq.
fn is_palindrome<T>(v: &[T]) -> bool
where
T: Eq,
{
v.iter().eq(v.iter().rev())
}
fn main() {
println!("{}", is_palindrome(&[1, 2, 3]));
println!("{}", is_palindrome(&[1, 2, 1]));
}
If you wanted to do the less space-efficient version, you have to allocate a new vector yourself:
fn is_palindrome<T>(v: &[T]) -> bool
where
T: Eq + Clone,
{
let mut reverse = v.to_vec();
reverse.reverse();
reverse == v
}
fn main() {
println!("{}", is_palindrome(&[1, 2, 3]));
println!("{}", is_palindrome(&[1, 2, 1]));
}
Note that we are now also required to Clone the items in the vector, so we add that trait bound to the method.

What type should I use for a 2-dimensional array?

What is wrong with the type of a here?
fn foo(a: &[&[f64]], x: &[f64]) {
for i in 0..3 {
for j in 0..4 {
println!("{}", a[i][j]);
}
}
}
fn main() {
let A: [[f64; 4]; 3] = [
[1.1, -0.2, 0.1, 1.6],
[0.1, -1.2, -0.2, 2.3],
[0.2, -0.1, 1.1, 1.5],
];
let mut X: [f64; 3] = [0.0; 3];
foo(&A, &X);
}
I get the compilation failure:
error[E0308]: mismatched types
--> src/main.rs:17:9
|
17 | foo(&A, &X);
| ^^ expected slice, found array of 3 elements
|
= note: expected type `&[&[f64]]`
found type `&[[f64; 4]; 3]`
Arrays are different types from slices. Notably, arrays have a fixed size, known at compile time. Slices have a fixed size, but known only at run time.
I see two straightforward choices here (see Levans' answer for another). The first is to change your function to only accept references to arrays (or the whole array, if you can copy it or don't mind giving up ownership):
fn foo(a: &[[f64; 4]; 3], x: &[f64; 3]) {
for i in 0..3 {
for j in 0..4 {
println!("{}", a[i][j]);
}
}
}
fn main() {
let a = [
[1.1, -0.2, 0.1, 1.6],
[0.1, -1.2, -0.2, 2.3],
[0.2, -0.1, 1.1, 1.5],
];
let x = [0.0; 3];
foo(&a, &x);
}
The other easy change is to make your declaration into references:
fn foo(a: &[&[f64]], x: &[f64]) {
for i in 0..3 {
for j in 0..4 {
println!("{}", a[i][j]);
}
}
}
fn main() {
let a = [
&[1.1, -0.2, 0.1, 1.6][..],
&[0.1, -1.2, -0.2, 2.3][..],
&[0.2, -0.1, 1.1, 1.5][..],
];
let x = [0.0; 3];
foo(&a, &x);
}
Note that in this second example, we can use the implicit coercion of a reference to an array to a slice when we just pass in &a and &x. However, we cannot rely on that for the nested data in a. a has already been defined to be an array of arrays, and we can't change the element type.
Also a word of caution - you really should use the length method of the slice in your ranges, otherwise you can easily panic! if you walk off the end.
fn foo(a: &[&[f64]], x: &[f64]) {
for i in 0..a.len() {
let z = &a[i];
for j in 0..z.len() {
println!("{}", z[j]);
}
}
}
Other stylistic changes I made to meet the Rust style:
variables are snake_case
space after :
space after ;
space around =
space after ,
As an alternative to Shepmaster's good explanation of the mechanisms, there is actually another way to have your function accept any mix of arrays and slices (and even Vec): it involves using generics with the AsRef trait.
The idea is to write your function like this:
use std::convert::AsRef;
fn foo<S, T, U>(a: S, x: U)
where
T: AsRef<[f64]>,
S: AsRef<[T]>,
U: AsRef<[f64]>,
{
let slice_a = a.as_ref();
for i in 0..slice_a.len() {
let slice_aa = slice_a[i].as_ref();
for j in 0..slice_aa.len() {
println!("{}", slice_aa[j]);
}
}
}
This is quite a function signature, but it is in fact quite simple: S must convert to a &[T] via the AsRef trait, and T must convert to &[f64] similarly. In the same way, U must convert to &[f64], but U and T do not need to be the same type.
This way, S can be an array of slices, an array of array, a Vec of arrays or of slices, an array of Vec... Any combination is possible as long as the types implement the AsRef trait.
Be careful though: before Rust 1.47, the AsRef trait was only implemented for arrays up to size 32. Later versions implement it for arrays of any length.
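A usage sketch for the AsRef approach, showing that one generic function accepts several container combinations. The function `sum_all` is a hypothetical variant of foo that returns a sum so the result is easy to check:

```rust
/// Hypothetical variant of `foo`: sums every element of a 2-dimensional container.
fn sum_all<S, T>(a: S) -> f64
where
    S: AsRef<[T]>,
    T: AsRef<[f64]>,
{
    a.as_ref()
        .iter()
        .map(|row| {
            // Each row converts to a plain slice of f64.
            let row: &[f64] = row.as_ref();
            row.iter().sum::<f64>()
        })
        .sum()
}

fn main() {
    let arrays: [[f64; 2]; 2] = [[1.0, 2.0], [3.0, 4.0]];
    let vecs: Vec<Vec<f64>> = vec![vec![1.0, 2.0], vec![3.0, 4.0]];
    let slices: [&[f64]; 2] = [&[1.0, 2.0], &[3.0, 4.0]];
    // Array of arrays, Vec of Vecs, and array of slices all satisfy the bounds.
    assert_eq!(sum_all(arrays), 10.0);
    assert_eq!(sum_all(vecs), 10.0);
    assert_eq!(sum_all(slices), 10.0);
}
```

Rows of different lengths also work here, because nothing in the bounds fixes the inner length.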
