Bitwise operations, comparing a u32 with a byte array - rust

Let's say I have the value 1025 as a byte array and the value 1030 as a usize. How would I go about comparing whether the byte array is greater, lesser or equal without deserializing it?
I'm completely stuck. I assume the easiest way is to find the most significant non-zero byte of the byte array and its position, then bit-shift the u32 and see if any bits in that byte are set; if not, the byte array is bigger.
In short, I want to write some functions to be able to decide if a > b, a < b and a == b.
To use a code example:
fn is_greater(a: &[u8], b: usize) -> bool {
    // a is LE, so reverse and get the largest bytes
    let c = a.iter()
        .enumerate()
        .rev()
        .filter_map(|(i, b)| if *b != 0 { Some((i, *b)) } else { None })
        .collect::<Vec<(usize, u8)>>();
    for (i, be) in c {
        let k = (b >> (i * 8)) & 255;
        println!("{}, {}", be, k);
        return be as usize > k;
    }
    false
}
EDIT: I should have clarified: the byte array can hold any integer, unsigned integer or float, simply anything bincode::serialize can serialize.
I also had in mind to avoid converting the byte array, since the comparison is supposed to run over hundreds of thousands of byte arrays, so I assume bit operations are the preferred way.

No need for all those extra steps. The basic problem is to know whether the integer encoded in the byte array is little endian, big endian or native endian. Knowing that, you can use usize::from_le_bytes / from_be_bytes / from_ne_bytes to convert a fixed-size array to an integer; use the TryFrom trait to get the fixed-size array from the slice.
fn is_greater(b: &[u8], v: usize) -> Result<bool, std::array::TryFromSliceError> {
    use std::convert::TryFrom;
    Ok(usize::from_le_bytes(<[u8; 8]>::try_from(b)?) > v)
}
This function will return an error if the byte slice is not exactly 8 bytes long, since that is the only width from which try_from can build the [u8; 8] needed for a usize; you can also convert to u32 or even u16, upcast that to usize and then do the comparison. Also notice that this example uses from_le_bytes, assuming the byte slice contains an integer encoded as little endian.
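For example, here is a sketch of the u32 variant mentioned above (the function name is mine; it assumes a 4-byte little-endian encoding):

use std::convert::TryFrom;

fn is_greater_u32(b: &[u8], v: usize) -> Result<bool, std::array::TryFromSliceError> {
    // Decode 4 little-endian bytes into a u32, then upcast to usize for the comparison.
    let n = u32::from_le_bytes(<[u8; 4]>::try_from(b)?) as usize;
    Ok(n > v)
}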

Related

Getting the length of an int

I am trying to get the length (the number of digits when interpreted in decimal) of an int in Rust. I found a way to do it, but I am looking for a method that comes from the primitive itself. This is what I have:
let num = 90.to_string();
println!("num: {}", num.chars().count());
// num: 2
I am looking at https://docs.rs/digits/0.3.3/digits/struct.Digits.html#method.length. Is this a good candidate? How do I use it? Or are there other crates that do it for me?
One-liners with less type conversion are the ideal solution I am looking for.
You could loop and check how often you can divide the number by 10 before it becomes a single digit.
Or in the other direction (because division is slower than multiplication), check how often you can multiply 10*10*...*10 until you reach the number:
fn length(n: u32, base: u32) -> u32 {
    let mut power = base;
    let mut count = 1;
    while n >= power {
        count += 1;
        if let Some(new_power) = power.checked_mul(base) {
            power = new_power;
        } else {
            break;
        }
    }
    count
}
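For example (my own test values):

fn main() {
    assert_eq!(length(9, 10), 1);
    assert_eq!(length(10, 10), 2);
    assert_eq!(length(1025, 10), 4);
    assert_eq!(length(u32::MAX, 10), 10); // checked_mul stops the loop before power overflows
}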
With nightly Rust (or, since Rust 1.67, on stable, where the stabilized method is spelled checked_ilog10), you can use:
#![feature(int_log)]
n.checked_log10().unwrap_or(0) + 1
Here is a one-liner that doesn't require strings or floating point (bring std::iter::successors into scope):
println!("num: {}", successors(Some(n), |&n| (n >= 10).then(|| n / 10)).count());
It simply counts the number of times the initial number needs to be divided by 10 in order to reach 0.
EDIT: the first version of this answer used iterate from the (excellent and highly recommended) itertools crate, but @trentcl pointed out that successors from the stdlib does the same. For reference, here is the version using iterate:
println!("num: {}", iterate(n, |&n| n / 10).take_while(|&n| n > 0).count().max(1));
Here's a (barely) one-liner that's faster than doing a string conversion, using std::iter stuff:
let some_int = 9834;
let decimal_places = (0..).take_while(|i| 10u64.pow(*i) <= some_int).count();
The first method below relies on the change-of-base formula, where a and b are the logarithm bases:
log_a(x) = log_b(x) / log_b(a)
log_a(x) = log_2(x) / log_2(a)  // substituting 2 for b
The following function can be applied to finding the number of digits for bases that are a power of 2. This approach is very fast.
fn num_digits_base_pow2(n: u64, b: u32) -> u32 {
    (63 - n.leading_zeros()) / (31 - b.leading_zeros()) + 1
}
The bits are counted for both n (the number we want to represent) and b (the base) to find their log2 floor values. The ratio of these two values, plus one, gives the number of digits in the desired base.
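A quick sanity check of the formula (my own test values):

fn main() {
    assert_eq!(num_digits_base_pow2(1000, 16), 3); // 1000 = 0x3E8
    assert_eq!(num_digits_base_pow2(255, 16), 2);  // 255 = 0xFF
    assert_eq!(num_digits_base_pow2(7, 2), 3);     // 7 = 0b111
}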
For a general purpose approach to finding the number of digits for arbitrary bases, the following should suffice.
fn num_digits(n: u64, b: u32) -> u32 {
    // The + 1 handles exact powers of the base (e.g. 100 in base 10),
    // for which the ceiling of the log alone would be one digit short.
    ((n + 1) as f64).log(b as f64).ceil() as u32
}
if num is signed:
let digits = (num.abs() as f64 + 0.1).log10().ceil() as u32;
A nice property of numbers that is always good to have in mind is that the number of digits required to write a number $x$ in base $n$ is actually $\lceil \log_n(x + 1) \rceil$.
Therefore, one can simply write the following function (notice the cast from u32 to f32, since integers don't have a log function).
fn length(n: u32, base: u32) -> u32 {
    let n = (n + 1) as f32;
    n.log(base as f32).ceil() as u32
}
You can easily adapt it for negative numbers. For floating point numbers this might be a bit (i.e. a lot) more tricky.
To take into account Daniel's comment about the pathological cases introduced by using f32: with nightly Rust, integers have a logarithm method. (Notice that, in my opinion, those are implementation details, and you should focus more on understanding the algorithm than the implementation.)
#![feature(int_log)]
fn length(n: u32, base: u32) -> u32 {
    n.log(base) + 1
}

Convert array (or vector) of u16 (or u32, u64) to array of u8

I have no problem doing this for a single u16 to two u8s using bit shifts and casts, but how could I do it for a whole array of u16? Ideally I would even prefer to convert directly from Vec<u16> to &[u8]. What would be the most elegant way to do it?
&my_vector[..] // my vector converted to [u16] but I need [u8]
I was able to make it work thanks to @Aplet123's insight:
From vector to bytes array
From Vec<u16> to &[u8] (align_to is unsafe, so it needs an unsafe block):
let mut my_u16_vec: Vec<u16> = Vec::new();
let my_u8_array = unsafe { my_u16_vec.align_to::<u8>().1 };
From bytes array back to vector
From &[u8] to Vec<u16> (the slice is 2-byte aligned here because it came from a Vec<u16>):
let n = my_u16_vec.len() * 2;
let my_u16_vec_bis: Vec<u16> = unsafe { my_u8_array[..n].align_to::<u16>().1 }.to_vec();
Getting the bytes right
And then swap the bytes of each value if the data came from a machine with the opposite endianness (on the same machine, the round trip above already preserves the values):
for e in my_u16_vec_bis.iter() {
    let value = e >> 8 | (e & 0xff) << 8; // equivalent to e.swap_bytes()
}
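If you'd rather avoid unsafe and be explicit about byte order, a safe sketch using the standard to_le_bytes/from_le_bytes helpers does the same round trip (the function names are mine):

fn u16s_to_le_bytes(v: &[u16]) -> Vec<u8> {
    // Each u16 contributes its two bytes in little-endian order.
    v.iter().flat_map(|x| x.to_le_bytes()).collect()
}

fn le_bytes_to_u16s(bytes: &[u8]) -> Vec<u16> {
    // chunks_exact(2) silently drops a trailing odd byte, if any.
    bytes
        .chunks_exact(2)
        .map(|c| u16::from_le_bytes([c[0], c[1]]))
        .collect()
}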

How can I define a generic function that can return a given integer type?

I'd like to define a function that can return a number whose type is specified when the function is called. The function takes a buffer (Vec<u8>) and returns a numeric value, e.g.
let byte = buf_to_num::<u8>(&buf);
let integer = buf_to_num::<u32>(&buf);
The buffer contains an ASCII string that represents a number, e.g. b"827", where each byte is the ASCII code of a digit.
This is my non-working code:
extern crate num;
use num::Integer;
use std::ops::{MulAssign, AddAssign};

fn buf_to_num<T: Integer + MulAssign + AddAssign>(buf: &Vec<u8>) -> T {
    let mut result: T;
    for byte in buf {
        result *= 10;
        result += (byte - b'0');
    }
    result
}
I get mismatched type errors for both the addition and the multiplication lines (expected type T, found u32). So I guess my problem is how to tell the type system that T can be expressed in terms of a literal 10 or in terms of the result of (byte - b'0')?
Welcome to the joys of having to specify every single operation you're using as a generic bound. It's a pain, but it is worth it.
You have two problems:
result *= 10; has no corresponding From<_> definition. When you write 10, there is no way for the compiler to know what 10 as a T means: it knows primitive types, and any conversion you have defined by implementing From<_> traits.
You're mixing up two operations: converting from a vector of ASCII characters to an integer, and the arithmetic itself.
We need to make two assumptions for this:
We will require From<u32>, so we can construct a T from u32 values
We will also clarify your logic and convert each u8 to char, so we can use to_digit() to turn it into a u32, before making use of From<u32> to get a T.
use std::ops::{MulAssign, AddAssign};

fn parse_to_i<T: From<u32> + MulAssign + AddAssign>(buf: &[u8]) -> T {
    let mut buffer: T = 0u32.into();
    for o in buf {
        buffer *= 10.into();
        buffer += (*o as char).to_digit(10).unwrap_or(0).into();
    }
    buffer
}
You can convince yourself of its behavior on the playground
The multiplication is resolved by calling .into() on the constant, which makes it benefit from our requirement of From<u32> for T and allows the Rust compiler to know we're not doing silly stuff.
The final change is to give the accumulator (buffer) a starting value of 0.
Let me know if this makes sense to you (or if it doesn't), and I'll be glad to elaborate further if there is a problem :-)
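For example, the call sites pick the return type (a quick sketch using the function above; the test values are mine):

fn main() {
    let as_u32: u32 = parse_to_i(b"827");
    let as_u64: u64 = parse_to_i(b"827");
    assert_eq!(as_u32, 827);
    assert_eq!(as_u64, 827);
}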

Can a BigInteger be truncated to an i32 in Rust?

In Java, intValue() gives back a truncated portion of the BigInteger instance. I wrote a similar program in Rust but it appears not to truncate:
extern crate num;
use num::bigint::{BigInt, RandBigInt};
use num::ToPrimitive;
fn main() {
    println!("Hello, world!");
    truncate_num(
        BigInt::parse_bytes(b"423445324324324324234324", 10).unwrap(),
        BigInt::parse_bytes(b"22447", 10).unwrap(),
    );
}

fn truncate_num(num1: BigInt, num2: BigInt) -> i32 {
    println!("Truncation of {} is {:?}.", num1, num1.to_i32());
    println!("Truncation of {} is {:?}.", num2, num2.to_i32());
    return 0;
}
The output I get from this is
Hello, world!
Truncation of 423445324324324324234324 is None.
Truncation of 22447 is Some(22447).
How can I achieve this in Rust? Should I try a conversion to String and then truncate manually? This would be my last resort.
Java's intValue() returns the lowest 32 bits of the integer. This could be done by a bitwise-AND operation x & 0xffffffff. A BigInt in Rust doesn't support bitwise manipulation, but you could first convert it to a BigUint which supports such operations.
fn truncate_biguint_to_u32(a: &BigUint) -> u32 {
    use std::u32;
    let mask = BigUint::from(u32::MAX);
    (a & mask).to_u32().unwrap()
}
Converting BigInt to BigUint will be successful only when it is not negative. If the BigInt is negative (-x), we could find the lowest 32 bits of its absolute value (x), then negate the result.
fn truncate_bigint_to_u32(a: &BigInt) -> u32 {
    use num_traits::Signed;
    let was_negative = a.is_negative();
    let abs = a.abs().to_biguint().unwrap();
    let truncated = truncate_biguint_to_u32(&abs);
    if was_negative {
        truncated.wrapping_neg()
    } else {
        truncated
    }
}
You may use truncate_bigint_to_u32(a) as i32 if you need a signed number.
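Putting it to work on the question's first value (my own sketch; truncate_bigint_to_u32 is the function defined above):

use num::bigint::BigInt;

fn main() {
    let a = BigInt::parse_bytes(b"423445324324324324234324", 10).unwrap();
    // Low 32 bits of the big integer, reinterpreted as a signed value,
    // mirroring Java's intValue() semantics.
    let low32 = truncate_bigint_to_u32(&a) as i32;
    println!("Truncation of {} is {}.", a, low32);
}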
There is also a to_signed_bytes_le() method with which you could extract the bytes and decode that into a primitive integer directly:
fn truncate_bigint_to_u32_slow(a: &BigInt) -> u32 {
    let mut bytes = a.to_signed_bytes_le();
    // Pad with the sign byte so negative values shorter than 4 bytes keep
    // their two's-complement low bits.
    let pad = if bytes.last().map_or(false, |b| b & 0x80 != 0) { 0xFF } else { 0 };
    bytes.resize(4, pad);
    bytes[0] as u32 | (bytes[1] as u32) << 8 | (bytes[2] as u32) << 16 | (bytes[3] as u32) << 24
}
This method is extremely slow compared to the above methods and I don't recommend using it.
There's no natural truncation of a big integer into a smaller one. Either it fits or you have to decide what value you want.
You could do this:
println!("Truncation of {} is {:?}.", num1, num1.to_i32().unwrap_or(-1));
or
println!("Truncation of {} is {:?}.", num1, num1.to_i32().unwrap_or(std::i32::MAX));
but your application logic should probably dictate what's the desired behavior when the returned option contains no value.

Extract 7 bits signed integer from u8 byte

I am using the Human Interface Device protocol to get data from an external device. The library I'm using returns an array of bytes ([u8; 64]), and I want to extract a 7-bit signed integer (an i7, which becomes an i8 in Rust) from one of those bytes.
The byte I want to manipulate has two different pieces of information in it:
1 bit for something
the 7 other bits (which I have to decode as a signed integer) for another thing.
Do you know what I can do to achieve this?
Using the bitreader crate I have been able to properly decode the signed 7-bit integer:
use bitreader::BitReader;

let mut bit_reader = BitReader::new(buffer);
let first_useless_bit: u8 = bit_reader.read_u8(1).unwrap();
let extracted_value: i8 = bit_reader.read_i8(7).unwrap();
Your question is pretty unclear, but I think you are just asking about normal bit manipulation. Mask off the unwanted bit (assuming the value sits in the lower 7 bits, although you did not say) and sign-extend the remaining bits to get a signed number:
fn main() {
    let byte = 0xFFu8;
    // Mask the low 7 bits, then sign-extend from bit 6 by shifting the value
    // up against the sign bit and back down with an arithmetic shift.
    let byte2 = ((byte & 0b0111_1111) as i8) << 1 >> 1;
    println!("{}", byte2); // -1: the 7-bit pattern 0b111_1111 is -1 in two's complement
}
If you want to turn an array of u8 into a vector of i8 while ignoring the most significant bit, you can do it in the following manner:
fn main() {
    let array_unsigned = [1u8, 2, 3]; // this will work for 64 values too
    let vec_signed: Vec<i8> = array_unsigned.into_iter() // yields u8 values in edition 2021
        .map(|e| if e <= 127 { e as i8 } else { (e - 128) as i8 })
        .collect();
    println!("{:?}", vec_signed);
}
This way consumes the input array. It could probably be done in a nicer way with some bit-fiddling.
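For reference, here is the bit-fiddling version alluded to above (my own sketch): masking off the most significant bit is equivalent to the branch.

fn main() {
    let array_unsigned = [1u8, 130, 255];
    let vec_signed: Vec<i8> = array_unsigned.into_iter()
        .map(|e| (e & 0x7F) as i8)
        .collect();
    println!("{:?}", vec_signed); // [1, 2, 127]
}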
