How to convert a bool array into a byte array - rust

How do I convert a bool array into a byte array?
I would like the resulting vector to have 1/8th the length of the input vector.
Like this
let a = [false; 160];
let b: [u8; 20] = ???
I guess I can make a for-loop and do some arithmetic, but I wonder if there is a simpler way of doing it.

As an alternative to the crate-based solutions, the "hand-rolled" version isn't too difficult to implement, and avoids the allocations:
let mut b = [0u8;20];
for (idx, bit) in a.into_iter().enumerate() {
let byte = idx / 8;
let shift = 7 - idx % 8;
b[byte] |= (bit as u8) << shift;
}
The obstacle to making this generic is that const generics on stable Rust don't yet support arithmetic on const parameters.
On nightly, with the generic_const_exprs feature, the function can work on arbitrary input sizes (this incomplete implementation will panic if the input length is not a multiple of 8, because the trailing bits index past the end of the output):
#![feature(generic_const_exprs)]
fn to_bits<const N: usize>(bools: [bool;N]) -> [u8;N/8] {
let mut out = [0;N/8];
for (idx, bit) in bools.into_iter().enumerate() {
let byte = idx / 8;
let shift = 7 - idx % 8;
out[byte] |= (bit as u8) << shift;
}
out
}
fn main() {
let mut a = [false; 160];
a[42] = true;
let b = to_bits(a);
println!("{:?}", b);
}
though I think bitvec also has a BitArray type which might allow doing this without allocations as well.
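On stable Rust, one possible workaround is to make the output length a second const parameter and check the relationship at run time. This is only a sketch: the name to_bits_stable and the parameter M are illustrative assumptions, and the caller has to supply M through a type annotation.

```rust
/// Packs `N` bools into `M` bytes, most significant bit first.
/// `M` is a separate const parameter because stable Rust cannot express
/// `N / 8` in the return type; the relationship is checked at run time.
fn to_bits_stable<const N: usize, const M: usize>(bools: [bool; N]) -> [u8; M] {
    assert_eq!(N, M * 8, "input length must be exactly 8 times the output length");
    let mut out = [0u8; M];
    for (idx, &bit) in bools.iter().enumerate() {
        // Bit 0 of the input goes to the top bit of byte 0.
        out[idx / 8] |= (bit as u8) << (7 - idx % 8);
    }
    out
}
```

Calling it as `let b: [u8; 20] = to_bits_stable(a);` lets the compiler infer M from the annotation. A mismatched length panics at the assertion instead of failing to compile, which is the price of staying on stable.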

This works and doesn't require a new dependency.
let mut b = [0u8;20];
for i in 0..160 {
b[i / 8] |= u8::from(a[i]) << (i % 8);
}
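Note that the loop above stores the first bool of each group of eight in the least significant bit, whereas the other answers here store it in the most significant bit, so the resulting bytes differ. A small sketch of the difference (the helper names are made up for illustration):

```rust
/// Packs up to 8 bools into one byte; the first bool lands in bit 7.
fn pack_byte_msb(bits: &[bool]) -> u8 {
    bits.iter()
        .take(8)
        .enumerate()
        .fold(0, |acc, (i, &bit)| acc | (u8::from(bit) << (7 - i)))
}

/// Packs up to 8 bools into one byte; the first bool lands in bit 0.
fn pack_byte_lsb(bits: &[bool]) -> u8 {
    bits.iter()
        .take(8)
        .enumerate()
        .fold(0, |acc, (i, &bit)| acc | (u8::from(bit) << i))
}
```

With only the third bool set, the MSB-first version yields 0b0010_0000 and the LSB-first version yields 0b0000_0100. Which one is right depends on what consumes the bytes.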

Using the bitvec crate, something like this should come close:
use bitvec::prelude::*;
let a = [false; 160];
let mut b = [0u8; 20];
// Collect into u8 storage, MSB-first, so the raw slice is exactly 20 bytes.
b.copy_from_slice(a.iter().collect::<BitVec<u8, Msb0>>().as_raw_slice());

I don't think there is a way to do that with just the built-in std library.
With the excellent bit-vec crate, though:
use bit_vec::BitVec;
fn main() {
let mut a = [false; 160];
a[42] = true;
let bitvec: BitVec = a.into_iter().collect();
let b: [u8; 20] = bitvec.to_bytes().try_into().unwrap();
println!("{:?}", b);
}
[0, 0, 0, 0, 0, 32, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
You can put it inside a generic function, but that requires nightly and unstable features:
#![feature(generic_const_exprs)]
use bit_vec::BitVec;
fn convert_vector<const N: usize>(bits: [bool; N]) -> [u8; N / 8] {
let bitvec: BitVec = bits.into_iter().collect();
bitvec.to_bytes().try_into().unwrap()
}
fn main() {
let mut a = [false; 160];
a[42] = true;
println!("{:?}", convert_vector(a));
}

Related

Convert a vector of u8 bytes into a rust_decimal

I am loading data from another language. Numbers can be very large and they are serialized as a byte array of u8s.
These are loaded into rust as a vec of u8s:
vec![1, 0, 0]
This represents 100. I also have a u32 to represent the scale.
I'm trying to load this into a rust_decimal, but am stuck.
measure_value.value -> a vec of u8
measure_value.scale -> a u32
let r_dec = rust_Decimal::????
This is the implementation I have so far, but it feels inelegant!
pub fn proto_to_decimal(input: &DecimalValueProto) -> Result<Decimal, String> {
let mut num = 0;
let mut power: i32 = (input.value.len() - 1)
.try_into()
.map_err(|_| "Failed to convert proto to decimal")?; //casting down from usize to i32 is failable
for digit in input.value.iter() {
let expansion = (*digit as i128) * 10_i128.pow(power as u32); // pow(0) == 1, so no special case is needed
num += expansion;
power -= 1;
}
Ok(Decimal::from_i128_with_scale(num as i128, input.scale))
}
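One way to avoid tracking the power explicitly is to fold the digits left to right, multiplying the running total by 10 at each step. The sketch below uses only the standard library and assumes each byte holds one decimal digit, most significant first, as in the example; the name digits_to_i128 is hypothetical.

```rust
/// Folds base-10 digits (most significant first) into an i128.
/// Returns None if a byte is not a single digit or the result overflows.
fn digits_to_i128(digits: &[u8]) -> Option<i128> {
    digits.iter().try_fold(0i128, |acc, &d| {
        if d > 9 {
            return None;
        }
        // acc * 10 + d, with overflow reported as None instead of panicking.
        acc.checked_mul(10)?.checked_add(i128::from(d))
    })
}
```

The resulting integer could then be passed to Decimal::from_i128_with_scale together with the scale, as in the code above.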

How to let several threads write to the same variable without mutex in Rust?

I am trying to implement an outer function that could calculate the outer product of two 1D arrays. Something like this:
use std::thread;
use ndarray::prelude::*;
pub fn multithread_outer(A: &Array1<f64>, B: &Array1<f64>) -> Array2<f64> {
let mut result = Array2::<f64>::default((A.len(), B.len()));
let thread_num = 5;
let n = A.len() / thread_num;
// a & b are ArcArray2<f64>
let a = A.to_owned().into_shared();
let b = B.to_owned().into_shared();
for i in 0..thread_num{
let a = a.clone();
let b = b.clone();
thread::spawn(move || {
for j in i * n..(i + 1) * n {
for k in 0..b.len() {
// This is the line I want to change
result[[j, k]] = a[j] * b[k];
}
}
});
}
// Use join to make sure all threads finish here
// Not so related to this question, so I didn't put it here
result
}
You can see that, by design, two threads will never write to the same element. However, the Rust compiler will not allow two mutable references to the same result variable, and using a mutex would make this much slower. What is the right way to implement this function?
While it is possible to do manually (with thread::scope and split_at_mut, for example), ndarray already has parallel iteration integrated into its library, based on rayon:
https://docs.rs/ndarray/latest/ndarray/parallel
Here is what your code would look like with parallel iterators:
use ndarray::parallel::prelude::*;
use ndarray::prelude::*;
pub fn multithread_outer(a: &Array1<f64>, b: &Array1<f64>) -> Array2<f64> {
let mut result = Array2::<f64>::default((a.len(), b.len()));
result
.axis_iter_mut(Axis(0))
.into_par_iter()
.enumerate()
.for_each(|(row_id, mut row)| {
for (col_id, cell) in row.iter_mut().enumerate() {
*cell = a[row_id] * b[col_id];
}
});
result
}
fn main() {
let a = Array1::from_vec(vec![1., 2., 3.]);
let b = Array1::from_vec(vec![4., 5., 6., 7.]);
let c = multithread_outer(&a, &b);
println!("{}", c)
}
[[4, 5, 6, 7],
[8, 10, 12, 14],
[12, 15, 18, 21]]
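For comparison, the manual route mentioned above can be sketched with only the standard library. This version uses thread::scope with chunks_mut (rather than split_at_mut) and a flat row-major Vec<f64> in place of Array2; the function name and the threads parameter are assumptions made for the illustration.

```rust
use std::thread;

/// Outer product of `a` and `b` as a flat row-major buffer (a.len() rows, b.len() columns).
/// Each thread receives a disjoint block of rows, so no locking is needed.
fn outer_scoped(a: &[f64], b: &[f64], threads: usize) -> Vec<f64> {
    let cols = b.len();
    let mut result = vec![0.0; a.len() * cols];
    if result.is_empty() {
        return result;
    }
    // Rows per thread, rounded up so every row is covered.
    let threads = threads.max(1);
    let rows_per = (a.len() + threads - 1) / threads;
    thread::scope(|s| {
        let blocks = result.chunks_mut(rows_per * cols).zip(a.chunks(rows_per));
        for (block, a_block) in blocks {
            s.spawn(move || {
                for (row, &x) in block.chunks_mut(cols).zip(a_block) {
                    for (cell, &y) in row.iter_mut().zip(b) {
                        *cell = x * y;
                    }
                }
            });
        }
        // All spawned threads are joined when the scope ends.
    });
    result
}
```

The borrow checker accepts this because chunks_mut hands out non-overlapping mutable slices, which encodes the "no two threads touch the same element" guarantee in the types.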

Calculate UDP Checksum in Rust?

I am trying to build packets manually and am having trouble calculating a correct UDP checksum. Can someone tell me what I am doing wrong in the code below? The packet passed in is the complete packet to be sent, with a placeholder of 0x0000 for the UDP checksum. I sum the pseudo-header, UDP header, and UDP payload, but according to Wireshark my UDP checksums are incorrect (mine: 0x9f4c vs. Wireshark: 0x2b7b, for example).
fn udp_checksum (packet: &Vec<u8>) -> [u8; 2] {
let mut idx = 0;
let mut idx_end = 2;
let mut payload = &packet[42..];
let payload_len = payload.len();
if payload_len % 2 != 0 {
payload.to_vec().push(0);
}
let source_ip_1 = BigEndian::read_u16(&packet[26..28]); //source ip 1 of 2
let source_ip_2 = BigEndian::read_u16(&packet[28..30]); //source ip 2 of 2
let dest_ip_1 = BigEndian::read_u16(&packet[30..32]); //dest ip 1 of 2
let dest_ip_2 = BigEndian::read_u16(&packet[32..34]); //dest ip 2 of 2
let udp_len = BigEndian::read_u16(&packet[38..40]);
let source_port = BigEndian::read_u16(&packet[34..36]);
let dest_port = BigEndian::read_u16(&packet[36..38]);
let mut header_sum = UDP_PROTO as u32 + source_ip_1 as u32 + source_ip_2 as u32 + dest_ip_1 as u32 + dest_ip_2 as u32 + udp_len as u32 + source_port as u32 + dest_port as u32 + udp_len as u32;
// println!("Payload Len: {:?}", &payload.len());
// println!("Payload: {:?}", &payload);
// println!("First Payload Slice: {:?}", &payload[idx..idx_end]);
// println!("First BE U32: {:?}", BigEndian::read_u16(&payload[idx..idx_end]) as u32);
while idx < &payload.len() - 2 {
header_sum += BigEndian::read_u16(&payload[idx..idx_end]) as u32;
println!("Header Sum: {:0x?}", &header_sum);
idx += 2;
idx_end += 2;
}
while header_sum > 0xffff {
header_sum -= 0xffff;
header_sum += 1;
}
let udp_csum = 0xffff - (header_sum as u16);
let csum_one: u8 = header_sum as u8;
let csum_two: u8 = (header_sum >> 8) as u8;
println!("Calculated CSUM: {:?}", udp_csum);
println!("Checksum: {:0x}{:0x}", csum_one, csum_two);
return [csum_one, csum_two];
}
This may be only a partial solution to the problem.
payload.to_vec() creates a new, temporary vector, so pushing onto it has no effect on payload.
Because payload is declared mut, it can be rebound to a slice of the extended vector when padding is needed.
Here is a minimal example.
fn main() {
let v1 = vec![9, 1, 2, 3];
let mut v2 = Vec::new(); // empty for now
let mut sl = &v1[1..]; // could be reassigned to v2
println!("before v1: {:?}", v1);
println!("before v2: {:?}", v2);
println!("before sl: {:?}", sl);
if sl.len() % 2 != 0 {
v2 = sl.to_vec();
v2.push(0);
sl = &v2[..];
}
println!("after v1: {:?}", v1);
println!("after v2: {:?}", v2);
println!("after sl: {:?}", sl);
}
/*
before v1: [9, 1, 2, 3]
before v2: []
before sl: [1, 2, 3]
after v1: [9, 1, 2, 3]
after v2: [1, 2, 3, 0]
after sl: [1, 2, 3, 0]
*/
Another solution, which avoids the copy, would be to stop the loop one byte early when payload.len() is odd, then handle the remaining byte separately.
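A minimal sketch of that copy-free approach, using only the standard library: sum the complete 16-bit words, then treat a leftover byte as the high byte of a final word padded with zero. The function names are hypothetical, and the caller is assumed to pass the pseudo-header, UDP header, and payload bytes through add_words in turn.

```rust
/// Adds `data` to a running one's-complement sum as big-endian 16-bit words.
/// An odd trailing byte is treated as if it were followed by a zero byte.
fn add_words(data: &[u8], mut sum: u32) -> u32 {
    let mut pairs = data.chunks_exact(2);
    for pair in &mut pairs {
        sum += u32::from(u16::from_be_bytes([pair[0], pair[1]]));
    }
    if let [last] = pairs.remainder() {
        sum += u32::from(*last) << 8;
    }
    sum
}

/// Folds the carries back into the low 16 bits and complements the result.
fn finish_checksum(mut sum: u32) -> u16 {
    while sum > 0xffff {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    !(sum as u16)
}
```

The final value can be written into the packet with to_be_bytes(), which keeps the two checksum bytes in network order.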

Is there a way to shuffle two or more lists in the same order?

I want something like this pseudocode:
a = [1, 2, 3, 4];
b = [3, 4, 5, 6];
iter = a.iter_mut().zip(b.iter_mut());
shuffle(iter);
// example shuffle:
// a = [2, 4, 3, 1];
// b = [4, 6, 5, 3];
More specifically, is there some function which performs like:
fn shuffle<T>(iterator: IterMut<T>) { /* ... */ }
My specific case is trying to shuffle an Array2 by rows together with a vector (array2: ndarray::Array2<f32>, vec: Vec<usize>).
Specifically array2.iter_axis(Axis(1)).zip(vec.iter()).
Shuffling a generic iterator in-place is not possible.
However, it's pretty easy to implement shuffling for a slice:
use rand::Rng;
pub fn shufflex<T: Copy>(slice: &mut [T]) {
let mut rng = rand::thread_rng();
let len = slice.len();
for i in 0..len {
let next = rng.gen_range(i, len);
let tmp = slice[i];
slice[i] = slice[next];
slice[next] = tmp;
}
}
But it's also possible to write a more general shuffle function that works on many types:
use std::ops::{Index, IndexMut};
use rand::Rng;
pub fn shuffle<T>(indexable: &mut T)
where
T: IndexMut<usize> + Len + ?Sized,
T::Output: Copy,
{
let mut rng = rand::thread_rng();
let len = indexable.len();
for i in 0..len {
let next = rng.gen_range(i, len);
let tmp = indexable[i];
indexable[i] = indexable[next];
indexable[next] = tmp;
}
}
I wrote a complete example that also allows shuffling across multiple slices in the playground.
EDIT: I think I misunderstood what you want to do. To shuffle several slices in the same way, I would do this:
use rand::Rng;
pub fn shuffle<T: Copy>(slices: &mut [&mut [T]]) {
if slices.len() > 0 {
let mut rng = rand::thread_rng();
let len = slices[0].len();
assert!(slices.iter().all(|s| s.len() == len));
for i in 0..len {
let next = rng.gen_range(i, len);
for slice in slices.iter_mut() {
let tmp: T = slice[i];
slice[i] = slice[next];
slice[next] = tmp;
}
}
}
}
To shuffle in the same order, you can first remember the order and then reuse it for every shuffle. Starting with the Fisher-Yates shuffle from the rand crate:
fn shuffle<R>(&mut self, rng: &mut R)
where R: Rng + ?Sized {
for i in (1..self.len()).rev() {
self.swap(i, gen_index(rng, i + 1));
}
}
It turns out that we need to store random numbers between 0 and i + 1 for each i between 1 and the length of the slice, in reverse order:
// create a vector of indices for shuffling slices of given length
let indices: Vec<usize> = {
let mut rng = rand::thread_rng();
(1..slice_len).rev()
.map(|i| rng.gen_range(0, i + 1))
.collect()
};
Then we can implement a variant of shuffle where, instead of generating new random numbers, we pick them up from the above list of random indices:
// shuffle SLICE taking indices from the provided vector
for (i, &rnd_ind) in (1..slice.len()).rev().zip(&indices) {
slice.swap(i, rnd_ind);
}
Putting the two together, you can shuffle multiple slices in the same order using a method like this (playground):
pub fn shuffle<T>(slices: &mut [&mut [T]]) {
if slices.len() == 0 {
return;
}
let indices: Vec<usize> = {
let mut rng = rand::thread_rng();
(1..slices[0].len())
.rev()
.map(|i| rng.gen_range(0, i + 1))
.collect()
};
for slice in slices {
assert_eq!(slice.len(), indices.len() + 1);
for (i, &rnd_ind) in (1..slice.len()).rev().zip(&indices) {
slice.swap(i, rnd_ind);
}
}
}
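The same "record the swaps once, replay them on every slice" idea can be tried without any crates. In the sketch below, XorShift64 is a toy generator standing in for rand (an assumption made so the example is self-contained; it is slightly biased and not suitable for real use), and swap_plan and apply_plan are hypothetical names.

```rust
/// Tiny xorshift64 generator; a stand-in for `rand`. The seed must be non-zero.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Index in 0..bound (modulo bias is acceptable for an illustration).
    fn below(&mut self, bound: usize) -> usize {
        (self.next_u64() % bound as u64) as usize
    }
}

/// Records the Fisher-Yates swap targets for a slice of length `len`.
fn swap_plan(len: usize, rng: &mut XorShift64) -> Vec<usize> {
    (1..len).rev().map(|i| rng.below(i + 1)).collect()
}

/// Replays a recorded plan, so every slice ends up in the same order.
fn apply_plan<T>(slice: &mut [T], plan: &[usize]) {
    assert_eq!(plan.len(), slice.len().saturating_sub(1));
    for (i, &j) in (1..slice.len()).rev().zip(plan) {
        slice.swap(i, j);
    }
}
```

Because swap works on any T, this variant does not need the Copy bound, and element i of one slice stays paired with element i of every other slice shuffled with the same plan.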

Binary search a vector in chunks

I have a file of IPv4 addresses, which as we know are 4 bytes each. I wish to do a binary search over the file contents to find a given IP address. Rust has a built-in binary search, but it doesn't let you pass a length; it reads it from the vector instead.
I have tried to adapt the built-in Rust binary search but am a bit lost. This is where I am so far. Maybe there is a way to use the built-in method?
fn binary_search(s: &Vec<&u8>, x: &u32) -> Result<usize, usize> {
let f = |p: &[u8]| p.cmp(x); // need to compare byte slices somehow
let mut size = s.len() / 4;
if size == 0 {
return Err(0);
}
let mut base = 0usize;
while size > 1 {
let half = size / 2;
let mid = base + half;
let cmp = f(s[mid..mid+4]);
base = if cmp == Greater { base } else { mid };
size -= half;
}
let cmp = f(s[base..base+4]);
if cmp == Equal {
Ok(base)
} else {
Err(base + (cmp == Less) as usize)
}
}
It’d be better to have a slice with one element per address, either of 4-byte arrays ([u8; 4]), some equivalent struct (hey, Ipv4Addr), or just u32. Unfortunately, I don’t think it’s possible to reinterpret a &[u8] with a length divisible by 4 as &[[u8; 4]] yet (and the other options would need alignment). You could do this conversion while reading the file in chunks, though.
So first, in an equivalent example program:
use std::net::Ipv4Addr;
use std::str::FromStr; // needed for Ipv4Addr::from_str
fn main() {
let vec: Vec<Ipv4Addr> = vec![
[10, 0, 0, 0].into(),
[20, 0, 0, 0].into(),
[30, 0, 0, 0].into(),
];
println!("vec {:?}", vec);
let found = vec.binary_search(&Ipv4Addr::from_str("20.0.0.0").unwrap());
println!("found {:?}", found);
}
(playground)
Then reading from a file would look something like:
let mut vec: Vec<Ipv4Addr> = vec![];
loop {
let mut address = [0; 4];
match f.read_exact(&mut address) {
Ok(()) => {},
Err(err) if err.kind() == ErrorKind::UnexpectedEof => break,
err => err?,
}
vec.push(address.into());
}
where f is a BufReader around a file. (This version is slightly lax in that it ignores any trailing bytes that don’t form a complete 4-byte address.)
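If the bytes are already in memory, the same conversion can be sketched with chunks_exact, after which the built-in binary_search applies directly. The function name parse_addresses is an assumption for the example.

```rust
use std::net::Ipv4Addr;

/// Converts a flat byte buffer into addresses, 4 bytes per address.
/// Trailing bytes that do not form a full address are ignored.
fn parse_addresses(bytes: &[u8]) -> Vec<Ipv4Addr> {
    bytes
        .chunks_exact(4)
        .map(|c| Ipv4Addr::new(c[0], c[1], c[2], c[3]))
        .collect()
}
```

Since Ipv4Addr implements Ord, calling binary_search on the returned vector gives Ok(index) for a present address and Err(insertion_point) otherwise, provided the input is sorted.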
I think I have a working solution now, but I'm not a master at Rust, so please critique it harshly.
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=6e3102ea622f1ae0d66465f4007ccb03
use std::cmp::Ordering::{self, Equal, Greater, Less};
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};
use std::str::FromStr;
fn binary_search(s: Vec<u8>, x: Vec<u8>) -> Result<usize, usize> {
let f = |p: &[u8]| p.cmp(&x);
let mut size = s.len() / 4;
if size == 0 {
return Err(0);
}
let mut base = 0usize;
while size > 1 {
let half = size / 2;
let mid = base + half;
// mid is always in [0, size), that means mid is >= 0 and < size.
// mid >= 0: by definition
// mid < size: mid = size / 2 + size / 4 + size / 8 ...
let cmp = f(&s[mid*4..(mid+1)*4]); // borrow the record; f expects &[u8], not Vec<u8>
base = if cmp == Greater { base } else { mid };
size -= half;
}
// base is always in [0, size) because base <= mid.
let cmp = f(&s[base*4..(base+1)*4]);
if cmp == Equal {
Ok(base*4)
} else {
Err(base*4 + ((cmp == Less) as usize) * 4)
}
}
fn main() {
let vec: Vec<u8> = vec![10, 0, 0, 0, 20, 0, 0, 0, 30, 0, 0, 0];
println!("vec {:?}", vec);
let found = binary_search(vec, Ipv4Addr::from_str("20.0.0.0").unwrap().octets().to_vec());
println!("found {:?}", found);
}
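As a critique in code form, here is a sketch of a variant that borrows its inputs instead of taking Vec<u8> by value and returns a record index rather than a byte offset. It relies on byte slices comparing lexicographically, which matches numeric order for big-endian addresses; the name search_records is hypothetical.

```rust
use std::cmp::Ordering;

/// Binary search over a sorted flat buffer of 4-byte records.
/// Returns Ok(record_index) if found, or Err(record_index) where it could be inserted.
fn search_records(data: &[u8], target: [u8; 4]) -> Result<usize, usize> {
    let (mut lo, mut hi) = (0, data.len() / 4);
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        let record = &data[mid * 4..mid * 4 + 4];
        match record.cmp(&target[..]) {
            Ordering::Less => lo = mid + 1,
            Ordering::Greater => hi = mid,
            Ordering::Equal => return Ok(mid),
        }
    }
    Err(lo)
}
```

Multiplying the returned index by 4 recovers the byte offset if that is what the caller needs; an address can be passed in as Ipv4Addr::octets().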
