How to format to other number bases besides decimal, hexadecimal? [duplicate] - rust

Currently I'm using the following code to return a number as a binary (base 2), octal (base 8), or hexadecimal (base 16) string.
fn convert(inp: u32, out: u32, numb: &String) -> Result<String, String> {
match isize::from_str_radix(numb, inp) {
Ok(a) => match out {
2 => Ok(format!("{:b}", a)),
8 => Ok(format!("{:o}", a)),
16 => Ok(format!("{:x}", a)),
10 => Ok(format!("{}", a)),
0 | 1 => Err(format!("No base lower than 2!")),
_ => Err(format!("printing in this base is not supported")),
},
Err(e) => Err(format!(
"Could not convert {} to a number in base {}.\n{:?}\n",
numb, inp, e
)),
}
}
Now I want to replace the inner match statement so I can return the number as an arbitrarily based string (e.g. base 3.) Are there any built-in functions to convert a number into any given radix, similar to JavaScript's Number.toString() method?

For now, you cannot do it using the standard library, but you can:
use my crate radix_fmt
or roll your own implementation:
fn format_radix(mut x: u32, radix: u32) -> String {
let mut result = vec![];
loop {
let m = x % radix;
x = x / radix;
// will panic if you use a bad radix (< 2 or > 36).
result.push(std::char::from_digit(m, radix).unwrap());
if x == 0 {
break;
}
}
result.into_iter().rev().collect()
}
fn main() {
assert_eq!(format_radix(1234, 10), "1234");
assert_eq!(format_radix(1000, 10), "1000");
assert_eq!(format_radix(0, 10), "0");
}

If you wanted to eke out a little more performance, you can create a struct and implement Display or Debug for it. This avoids allocating a String. For maximum over-engineering, you can also have a stack-allocated array instead of the Vec.
Here is Boiethios' answer with these changes applied:
struct Radix {
x: i32,
radix: u32,
}
impl Radix {
fn new(x: i32, radix: u32) -> Result<Self, &'static str> {
if radix < 2 || radix > 36 {
Err("Unnsupported radix")
} else {
Ok(Self { x, radix })
}
}
}
use std::fmt;
impl fmt::Display for Radix {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
let mut x = self.x;
// Good for binary formatting of `u128`s
let mut result = ['\0'; 128];
let mut used = 0;
let negative = x < 0;
if negative {
x*=-1;
}
let mut x = x as u32;
loop {
let m = x % self.radix;
x /= self.radix;
result[used] = std::char::from_digit(m, self.radix).unwrap();
used += 1;
if x == 0 {
break;
}
}
if negative {
write!(f, "-")?;
}
for c in result[..used].iter().rev() {
write!(f, "{}", c)?;
}
Ok(())
}
}
fn main() {
assert_eq!(Radix::new(1234, 10).to_string(), "1234");
assert_eq!(Radix::new(1000, 10).to_string(), "1000");
assert_eq!(Radix::new(0, 10).to_string(), "0");
}
This could still be optimized by:
creating an ASCII array instead of a char array
not zero-initializing the array
Since these avenues require unsafe or an external crate like arraybuf, I have not included them. You can see sample code in internal implementation details of the standard library.

Here is an extended solution based on the first comment which does not bind the parameter x to be a u32:
fn format_radix(mut x: u128, radix: u32) -> String {
let mut result = vec![];
loop {
let m = x % radix as u128;
x = x / radix as u128;
// will panic if you use a bad radix (< 2 or > 36).
result.push(std::char::from_digit(m as u32, radix).unwrap());
if x == 0 {
break;
}
}
result.into_iter().rev().collect()
}

This is faster than the other answer:
use std::char::from_digit;
fn encode(mut n: u32, r: u32) -> Option<String> {
let mut s = String::new();
loop {
if let Some(c) = from_digit(n % r, r) {
s.insert(0, c)
} else {
return None
}
n /= r;
if n == 0 {
break
}
}
Some(s)
}
Note I also tried these, but they were slower:
https://doc.rust-lang.org/std/collections/struct.VecDeque.html#method.push_front
https://doc.rust-lang.org/std/string/struct.String.html#method.push
https://doc.rust-lang.org/std/vec/struct.Vec.html#method.insert

Related

Convert a vector of u8 bytes into a rust_decimal

I am loading data from another language. Numbers can be very large and they are serialized as a byte array of u8s.
These are loaded into rust as a vec of u8s:
vec![1, 0, 0]
This represents 100. I also have a u32 to represent the cale.
I'm trying to load this into a rust_decimal, but am stuck.
measure_value.value -> a vec of u8
measure_value.scale -> a u32
let r_dec = rust_Decimal::????
This is the implementation I have so far, but it feels inelegant!
pub fn proto_to_decimal(input: &DecimalValueProto) -> Result<Decimal, String> {
let mut num = 0;
let mut power: i32 = (input.value.len() - 1)
.try_into()
.map_err(|_| "Failed to convert proto to decimal")?; //casting down from usize to i32 is failable
for digit in input.value.iter() {
let expansion: i128 = if power == 0 { expansion = *digit as i128 } else { expansion = (*digit as i128) * 10_i128.pow(power as u32) as i128 }
num += expansion;
power -= 1;
}
Ok(Decimal::from_i128_with_scale(num as i128, input.scale))
}

How to write a macro that splits a byte into a tuple of bits of user-specified count?

I would like to have macro splitting one byte into tuple with 2-8 u8 parts using bitreader crate.
I managed to achieve that by following code:
use bitreader::BitReader;
trait Tupleprepend<T> {
type ResultType;
fn prepend(self, t: T) -> Self::ResultType;
}
macro_rules! impl_tuple_prepend {
( () ) => {};
( ( $t0:ident $(, $types:ident)* ) ) => {
impl<$t0, $($types,)* T> Tupleprepend<T> for ($t0, $($types,)*) {
type ResultType = (T, $t0, $($types,)*);
fn prepend(self, t: T) -> Self::ResultType {
let ($t0, $($types,)*) = self;
(t, $t0, $($types,)*)
}
}
impl_tuple_prepend! { ($($types),*) }
};
}
impl_tuple_prepend! {
(_1, _2, _3, _4, _5, _6, _7, _8)
}
macro_rules! split_byte (
($reader:ident, $bytes:expr, $count:expr) => {{
($reader.read_u8($count).unwrap(),)
}};
($reader:ident, $bytes:expr, $count:expr, $($next_counts:expr),+) => {{
let head = split_byte!($reader, $bytes, $count);
let tail = split_byte!($reader, $bytes, $($next_counts),+);
tail.prepend(head.0)
}};
($bytes:expr $(, $count:expr)* ) => {{
let mut reader = BitReader::new($bytes);
split_byte!(reader, $bytes $(, $count)+)
}};
);
Now I can use this code as I would like to:
let buf: &[u8] = &[0x72];
let (bit1, bit2, bits3to8) = split_byte!(&buf, 1, 1, 6);
Is there a way to avoid using Tupleprepend trait and create only 1 tuple instead of 8 in the worst scenario?
Because the number of bit widths directly corresponds to the number of returned values, I'd solve the problem using generics and arrays instead. The macro only exists to remove the typing of the [], which I don't really think is worth it.
fn split_byte<A>(b: u8, bit_widths: A) -> A
where
A: Default + std::ops::IndexMut<usize, Output = u8>,
for<'a> &'a A: IntoIterator<Item = &'a u8>,
{
let mut result = A::default();
let mut start = 0;
for (idx, &width) in bit_widths.into_iter().enumerate() {
let shifted = b >> (8 - width - start);
let mask = (0..width).fold(0, |a, _| (a << 1) | 1);
result[idx] = shifted & mask;
start += width;
}
result
}
macro_rules! split_byte {
($b:expr, $($w:expr),+) => (split_byte($b, [$($w),+]));
}
fn main() {
let [bit1, bit2, bits3_to_8] = split_byte!(0b1010_1010, 1, 1, 6);
assert_eq!(bit1, 0b1);
assert_eq!(bit2, 0b0);
assert_eq!(bits3_to_8, 0b10_1010);
}
See also:
How does for<> syntax differ from a regular lifetime bound?
How to write a trait bound for adding two references of a generic type?
How do I write the lifetimes for references in a type constraint when one of them is a local reference?
If it's ok to target nightly Rust, I'd use the unstable min_const_generics feature:
#![feature(min_const_generics)]
fn split_byte<const N: usize>(b: u8, bit_widths: [u8; N]) -> [u8; N] {
let mut result = [0; N];
let mut start = 0;
for (idx, &width) in bit_widths.iter().enumerate() {
let shifted = b >> (8 - width - start);
let mask = (0..width).fold(0, |a, _| (a << 1) | 1);
result[idx] = shifted & mask;
start += width;
}
result
}
macro_rules! split_byte {
($b:expr, $($w:expr),+) => (split_byte($b, [$($w),+]));
}
fn main() {
let [bit1, bit2, bits3_to_8] = split_byte!(0b1010_1010, 1, 1, 6);
assert_eq!(bit1, 0b1);
assert_eq!(bit2, 0b0);
assert_eq!(bits3_to_8, 0b10_1010);
}
See also:
Is it possible to control the size of an array using the type parameter of a generic?

Can I perform binary tree search with the standard library without wrapping the float type and abusing the BTreeMap?

I would like to find the first element which is greater than a limit from an ordered collection. While iteration over it is always an option, I need a faster one. Currently, I came up with a solution like this but it feels a little hacky:
use std::cmp::Ordering;
use std::collections::BTreeMap;
use std::ops::Bound::{Included, Unbounded};
#[derive(Debug)]
struct FloatWrapper(f32);
impl Eq for FloatWrapper {}
impl PartialEq for FloatWrapper {
fn eq(&self, other: &Self) -> bool {
(self.0 - other.0).abs() < 1.17549435e-36f32
}
}
impl Ord for FloatWrapper {
fn cmp(&self, other: &Self) -> Ordering {
if (self.0 - other.0).abs() < 1.17549435e-36f32 {
Ordering::Equal
} else if self.0 - other.0 > 0.0 {
Ordering::Greater
} else if self.0 - other.0 < 0.0 {
Ordering::Less
} else {
Ordering::Equal
}
}
}
impl PartialOrd for FloatWrapper {
fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
Some(self.cmp(other))
}
}
The wrapper around the float is not nice even that I am sure that there will be no NaNs
The Range is also unnecessary since I want a single element.
Is there a better way of achieving a similar result using only Rust's standard library? I know that there are plenty of tree implementations but it feels like overkill.
After the suggestions in the answer to use the iterator I did a little benchmark with the following code:
fn main() {
let measure = vec![
10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190,
200,
];
let mut measured_binary = Vec::new();
let mut measured_iter = Vec::new();
let mut measured_vec = Vec::new();
for size in measure {
let mut ww = BTreeMap::new();
let mut what_found = Vec::new();
for _ in 0..size {
let now: f32 = thread_rng().gen_range(0.0, 1.0);
ww.insert(FloatWrapper(now), now);
}
let what_to_search: Vec<FloatWrapper> = (0..10000)
.map(|_| thread_rng().gen_range(0.0, 0.8))
.map(|x| FloatWrapper(x))
.collect();
let mut rez = 0;
for current in &what_to_search {
let now = Instant::now();
let m = find_one(&ww, current);
rez += now.elapsed().as_nanos();
what_found.push(m);
}
measured_binary.push(rez);
rez = 0;
for current in &what_to_search {
let now = Instant::now();
let m = find_two(&ww, current);
rez += now.elapsed().as_nanos();
what_found.push(m);
}
measured_iter.push(rez);
let ww_in_vec: Vec<(FloatWrapper, f32)> =
ww.iter().map(|(&key, &value)| (key, value)).collect();
rez = 0;
for current in &what_to_search {
let now = Instant::now();
let m = find_three(&ww_in_vec, current);
rez += now.elapsed().as_nanos();
what_found.push(m);
}
measured_vec.push(rez);
println!("{:?}", what_found);
}
println!("binary :{:?}", measured_binary);
println!("iter_map :{:?}", measured_iter);
println!("iter_vec :{:?}", measured_vec);
}
fn find_one(from_what: &BTreeMap<FloatWrapper, f32>, what: &FloatWrapper) -> f32 {
let v: Vec<f32> = from_what
.range((Included(what), (Unbounded)))
.take(1)
.map(|(_, &v)| v)
.collect();
*v.get(0).expect("we are in truble")
}
fn find_two(from_what: &BTreeMap<FloatWrapper, f32>, what: &FloatWrapper) -> f32 {
from_what
.iter()
.skip_while(|(i, _)| *i < what) // Skipping all elements before it
.take(1) // Reducing the iterator to 1 element
.map(|(_, &v)| v) // Getting its value, dereferenced
.next()
.expect("we are in truble") // Our
}
fn find_three(from_what: &Vec<(FloatWrapper, f32)>, what: &FloatWrapper) -> f32 {
*from_what
.iter()
.skip_while(|(i, _)| i < what) // Skipping all elements before it
.take(1) // Reducing the iterator to 1 element
.map(|(_, v)| v) // Getting its value, dereferenced
.next()
.expect("we are in truble") // Our
}
The key takeaway for me is that it is worth to use the binary search after ~50 elements. In my case with 30000 elements means 200x speedup (at least based on this microbenchmark).
You said you wanted a std-only solution, but this is a common enough problem, so here's a solution using the crate ordered-float:
Cargo.toml
[dependencies]
ordered-float = "1.0"
main.rs
use ordered_float::OrderedFloat; // 1.0.2
use std::collections::BTreeMap;
fn main() {
let mut ww = BTreeMap::new();
ww.insert(OrderedFloat(1.0), "one");
ww.insert(OrderedFloat(2.0), "two");
ww.insert(OrderedFloat(3.0), "three");
ww.insert(OrderedFloat(4.0), "three");
let rez = ww.range(OrderedFloat(1.5)..).next().map(|(_, &v)| v);
println!("{:?}", rez);
}
prints
Some("two")
Now, isn't that nice and clean? If you want a less verbose syntax, I suggest wrapping the BTreeMap itself, so you can give it appropriately named methods that make sense for your application.
NaN behavior
Be aware that OrderedFloat may not behave the way you expect in the presence of NaNs:
NaN is sorted as greater than all other values and equal to itself, in contradiction with the IEEE standard.
Now that we've gone over and clarified the requirements a bit, there's a couple of bad news for you:
You're not getting away from the requirement to have a wrapping type. As I'm sure you've discovered, this is because no floating-point type implements Ord
You're also not getting away from a combinator of some sort
First, we're going to clear up your impl, as they both have shortfalls described in the comments. In the future, it may make sense to use the wrapper traits in eq-float, as they already implement all this. The implementations at fault are PartialEq and Ord, and they both break down on a few points. The new implementations:
impl Ord for FloatWrapper {
fn cmp(&self, other: &Self) -> Ordering {
self.0.partial_cmp(&other.0).unwrap_or_else(|| {
if self.0.is_nan() && !other.0.is_nan() {
Ordering::Less
} else if !self.0.is_nan() && other.0.is_nan() {
Ordering::Greater
} else {
Ordering::Equal
}
})
}
}
impl PartialEq for FloatWrapper {
fn eq(&self, other: &Self) -> bool {
if self.0.is_nan() && other.0.is_nan() {
true
} else {
self.0 == other.0
}
}
}
Nothing surprising, we're just abusing the fact that f32 implements PartialOrd for Ord and surfacing all other implementations on FloatWrapper itself.
Now, for the combinator. Your current combinator will force a range of elements to be stored temporarily in memory, to then discard one. We can do better by abusing the fact that iter() is a sorted iterator. So, we can skip while we search, and then take the first:
let mut first_element = ww.iter()
.skip_while(|(i, _)| *i < &FloatWrapper::new(1.5)) // Skipping all elements before it
.take(1) // Reducing the iterator to 1 element
.map(|(_, &v)| v) // Getting its value, dereferenced
.next(); // Our result
This yields a 10% speedup in low-element-count situations over your first implementation.

How to avoid cloning a big integer in rust

I used the num::BigUInt type to avoid integer overflows when calculating the factorial of a number.
However, I had to resort to using .clone() to pass rustc's borrow checker.
How can I refactor the factorial function to avoid cloning what could be large numbers many times?
use num::{BigUint, FromPrimitive, One};
fn main() {
for n in -2..33 {
let bign: Option<BigUint> = FromPrimitive::from_isize(n);
match bign {
Some(n) => println!("{}! = {}", n, factorial(n.clone())),
None => println!("Number must be non-negative: {}", n),
}
}
}
fn factorial(number: BigUint) -> BigUint {
if number < FromPrimitive::from_usize(2).unwrap() {
number
} else {
number.clone() * factorial(number - BigUint::one())
}
}
I tried to use a reference to BigUInt in the function definition but got some errors saying that BigUInt did not support references.
The first clone is easy to remove. You are trying to use n twice in the same expression, so don't use just one expression:
print!("{}! = ", n);
println!("{}", factorial(n));
is equivalent to println!("{}! = {}", n, factorial(n.clone())) but does not try to move n and use a reference to it at the same time.
The second clone can be removed by changing factorial not to be recursive:
fn factorial(mut number: BigUint) -> BigUint {
let mut result = BigUint::one();
let one = BigUint::one();
while number > one {
result *= &number;
number -= &one;
}
result
}
This might seem unidiomatic however. There is a range function, that you could use with for, however, it uses clone internally, defeating the point.
I don't think take a BigUint as parameter make sense for a factorial. u32 should be enough:
use num::{BigUint, One};
fn main() {
for n in 0..42 {
println!("{}! = {}", n, factorial(n));
}
}
fn factorial_aux(accu: BigUint, i: u32) -> BigUint {
if i > 1 {
factorial_aux(accu * i, i - 1)
}
else {
accu
}
}
fn factorial(n: u32) -> BigUint {
factorial_aux(BigUint::one(), n)
}
Or if you really want to keep BigUint:
use num::{BigUint, FromPrimitive, One, Zero};
fn main() {
for i in (0..42).flat_map(|i| FromPrimitive::from_i32(i)) {
print!("{}! = ", i);
println!("{}", factorial(i));
}
}
fn factorial_aux(accu: BigUint, i: BigUint) -> BigUint {
if !i.is_one() {
factorial_aux(accu * &i, i - 1u32)
} else {
accu
}
}
fn factorial(n: BigUint) -> BigUint {
if !n.is_zero() {
factorial_aux(BigUint::one(), n)
} else {
BigUint::one()
}
}
Both version doesn't do any clone.
If you use ibig::UBig instead of BigUint, those clones will be free, because ibig is optimized not to allocate memory from the heap for numbers this small.

Format/convert a number to a string in any base (including bases other than decimal or hexadecimal)

Currently I'm using the following code to return a number as a binary (base 2), octal (base 8), or hexadecimal (base 16) string.
fn convert(inp: u32, out: u32, numb: &String) -> Result<String, String> {
match isize::from_str_radix(numb, inp) {
Ok(a) => match out {
2 => Ok(format!("{:b}", a)),
8 => Ok(format!("{:o}", a)),
16 => Ok(format!("{:x}", a)),
10 => Ok(format!("{}", a)),
0 | 1 => Err(format!("No base lower than 2!")),
_ => Err(format!("printing in this base is not supported")),
},
Err(e) => Err(format!(
"Could not convert {} to a number in base {}.\n{:?}\n",
numb, inp, e
)),
}
}
Now I want to replace the inner match statement so I can return the number as an arbitrarily based string (e.g. base 3.) Are there any built-in functions to convert a number into any given radix, similar to JavaScript's Number.toString() method?
For now, you cannot do it using the standard library, but you can:
use my crate radix_fmt
or roll your own implementation:
fn format_radix(mut x: u32, radix: u32) -> String {
let mut result = vec![];
loop {
let m = x % radix;
x = x / radix;
// will panic if you use a bad radix (< 2 or > 36).
result.push(std::char::from_digit(m, radix).unwrap());
if x == 0 {
break;
}
}
result.into_iter().rev().collect()
}
fn main() {
assert_eq!(format_radix(1234, 10), "1234");
assert_eq!(format_radix(1000, 10), "1000");
assert_eq!(format_radix(0, 10), "0");
}
If you wanted to eke out a little more performance, you can create a struct and implement Display or Debug for it. This avoids allocating a String. For maximum over-engineering, you can also have a stack-allocated array instead of the Vec.
Here is Boiethios' answer with these changes applied:
struct Radix {
x: i32,
radix: u32,
}
impl Radix {
fn new(x: i32, radix: u32) -> Result<Self, &'static str> {
if radix < 2 || radix > 36 {
Err("Unnsupported radix")
} else {
Ok(Self { x, radix })
}
}
}
use std::fmt;
impl fmt::Display for Radix {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
let mut x = self.x;
// Good for binary formatting of `u128`s
let mut result = ['\0'; 128];
let mut used = 0;
let negative = x < 0;
if negative {
x*=-1;
}
let mut x = x as u32;
loop {
let m = x % self.radix;
x /= self.radix;
result[used] = std::char::from_digit(m, self.radix).unwrap();
used += 1;
if x == 0 {
break;
}
}
if negative {
write!(f, "-")?;
}
for c in result[..used].iter().rev() {
write!(f, "{}", c)?;
}
Ok(())
}
}
fn main() {
assert_eq!(Radix::new(1234, 10).to_string(), "1234");
assert_eq!(Radix::new(1000, 10).to_string(), "1000");
assert_eq!(Radix::new(0, 10).to_string(), "0");
}
This could still be optimized by:
creating an ASCII array instead of a char array
not zero-initializing the array
Since these avenues require unsafe or an external crate like arraybuf, I have not included them. You can see sample code in internal implementation details of the standard library.
Here is an extended solution based on the first comment which does not bind the parameter x to be a u32:
fn format_radix(mut x: u128, radix: u32) -> String {
let mut result = vec![];
loop {
let m = x % radix as u128;
x = x / radix as u128;
// will panic if you use a bad radix (< 2 or > 36).
result.push(std::char::from_digit(m as u32, radix).unwrap());
if x == 0 {
break;
}
}
result.into_iter().rev().collect()
}
This is faster than the other answer:
use std::char::from_digit;
fn encode(mut n: u32, r: u32) -> Option<String> {
let mut s = String::new();
loop {
if let Some(c) = from_digit(n % r, r) {
s.insert(0, c)
} else {
return None
}
n /= r;
if n == 0 {
break
}
}
Some(s)
}
Note I also tried these, but they were slower:
https://doc.rust-lang.org/std/collections/struct.VecDeque.html#method.push_front
https://doc.rust-lang.org/std/string/struct.String.html#method.push
https://doc.rust-lang.org/std/vec/struct.Vec.html#method.insert

Resources