Best way to pad a vector with zero bytes? - rust

I need to send some messages (as vectors), and they need to be sent as the same length. I want to pad the vectors with zeros if they are not the right length.
Let's say that we need all payloads to be of length 100. I thought I could use the extend function and do something like this:
let len = 100;
let mut msg = &[1,23,34].to_vec();
let diff : usize = msg.len()-len;
let padding = [0;diff].to_vec();
msg.extend(padding);
This won't compile though, since the compiler complains that diff is not a constant. It also seems rather verbose for such a simple task.
Is there a nice and concise way to do this?

You can fix your code with std::iter::repeat():
let len = 100;
let mut msg = vec![1, 23, 34];
let diff = len - msg.len();
msg.extend(std::iter::repeat(0).take(diff));
But there is a much better way. Vec::resize() is what you need:
let len = 100;
let mut msg = vec![1, 23, 34];
msg.resize(len, 0);
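One thing to keep in mind is that Vec::resize() also truncates when the vector is longer than the target length. A minimal sketch of a padding helper that never shortens a message, assuming that over-long messages should be left untouched (pad_to is a hypothetical name, not a standard library function):

```rust
/// Hypothetical helper: pads `msg` with zero bytes up to `len`.
/// Messages that are already `len` bytes or longer are left unchanged.
fn pad_to(msg: &mut Vec<u8>, len: usize) {
    if msg.len() < len {
        // resize() appends copies of the given value until the length is reached
        msg.resize(len, 0);
    }
}

fn main() {
    let mut msg = vec![1, 23, 34];
    pad_to(&mut msg, 100);
    assert_eq!(msg.len(), 100);
    assert_eq!(&msg[..3], &[1, 23, 34]);
}
```

Whether an over-long message should be kept, truncated, or rejected depends on the protocol, so the guard is only one possible choice.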

Related

How can I read at most N bytes from a `Read` instance?

I have a Read instance (in this case, a file). I want to read at most some number of bytes N, meaning that if the file has length greater than N I read exactly N bytes, but if the file length is less than N, I read to the end of the file.
I can't use read_exact, because that might return UnexpectedEof, which means I don't know what size to truncate the buffer to. I also don't want to just use a single read call, since that is OS-dependent and may read less than N.
I tried writing this, using Read::take:
const N: usize = 4096;
// Pretend this is a 20 byte file
let bytes = vec![3; 20];
let read = std::io::Cursor::new(&bytes);
let mut buf = vec![0; N];
let n = read.take(N as u64).read_to_end(&mut buf).unwrap();
buf.truncate(n);
assert_eq!(buf, bytes);
I would expect buf to be equal to bytes after the read_to_end call, but the assertion fails because buf ends up being only zeroes. The buffer does end up being the correct length, however.
read_to_end() appends to the vector it is given, and you are providing it with one that is already filled with N zeros, so the bytes that were read end up after the zeros and truncate(n) cuts them off. To fix your issue, rewrite your code using Vec::with_capacity, which preallocates but does not fill the vector.
use std::io::Read;
const N: usize = 4096;
let bytes = vec![3; 20];
let read = std::io::Cursor::new(&bytes);
// Use Vec::with_capacity() to allocate without filling the Vec
let mut buf = Vec::with_capacity(N);
let n = read.take(N as u64).read_to_end(&mut buf).unwrap();
buf.truncate(n);
assert_eq!(buf, bytes);
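The same idea can be wrapped in a small function. This is a sketch under the assumption that returning a fresh Vec is acceptable; read_at_most is a hypothetical name chosen for illustration:

```rust
use std::io::{self, Read};

/// Hypothetical helper: reads at most `n` bytes from `reader`,
/// stopping early if the input ends first.
fn read_at_most<R: Read>(reader: R, n: usize) -> io::Result<Vec<u8>> {
    // Preallocate without filling, so read_to_end() appends to an empty Vec
    let mut buf = Vec::with_capacity(n);
    reader.take(n as u64).read_to_end(&mut buf)?;
    Ok(buf)
}

fn main() -> io::Result<()> {
    // Pretend this is a 20 byte file
    let bytes = vec![3u8; 20];
    let all = read_at_most(io::Cursor::new(&bytes), 4096)?;
    assert_eq!(all, bytes);
    let first_five = read_at_most(io::Cursor::new(&bytes), 5)?;
    assert_eq!(first_five, vec![3u8; 5]);
    Ok(())
}
```

Since the returned Vec only contains what was read, no truncate call is needed afterwards.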
You should use std::io::Read::read:
use std::io::Read;
fn main() {
    const N: usize = 4096;
    // Pretend this is a 20 byte file
    let bytes = vec![3; 20];
    let mut read = std::io::Cursor::new(&bytes);
    let mut buf = vec![0; N];
    let n = read.read(&mut buf).unwrap();
    buf.truncate(n);
    assert_eq!(buf, bytes);
}
Note however that a single read call is allowed to return fewer bytes than requested, possibly less than the full file size. In practice this shouldn't be an issue for small files, since file transfers are done in blocks, usually 4kB, and interruptions only happen on block boundaries, so any file smaller than a block should be read entirely.
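If relying on that behavior is not acceptable, one option is to call read in a loop until the buffer is full or the input ends. A minimal sketch, assuming a caller-provided buffer (read_up_to is a hypothetical helper name):

```rust
use std::io::{self, ErrorKind, Read};

/// Hypothetical helper: fills `buf` as far as possible and returns the
/// number of bytes read. Stops at end of input or when `buf` is full.
fn read_up_to<R: Read>(reader: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match reader.read(&mut buf[filled..]) {
            Ok(0) => break,       // end of input
            Ok(n) => filled += n, // short read: keep going
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(filled)
}

fn main() -> io::Result<()> {
    let bytes = vec![3u8; 20];
    let mut reader = io::Cursor::new(&bytes);
    let mut buf = vec![0u8; 4096];
    let n = read_up_to(&mut reader, &mut buf)?;
    buf.truncate(n);
    assert_eq!(buf, bytes);
    Ok(())
}
```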

Parse integer from iterator

Is there a way to parse an integer from a str in Rust, like my_str.parse(), but yielding an iterator positioned just after the parsed integer? Something like this:
let my_str = "1234x";
let mut iter = my_str.chars();
assert_eq!(iter.parse_and_advance().unwrap(), 1234);
assert_eq!(iter.next(), Some('x'));
You don't need iterators at all. You can first use str::find to find the first non-numeric value, then use str::split_at to split the string there so you can parse the first half and convert the second half into an iterator:
let s = "1234x";
// Byte index of the first character that cannot be part of the number
let end = s.find(|c: char| c != '-' && !c.is_numeric());
let (num, rest) = s.split_at(end.unwrap_or(s.len()));
let num: i32 = num.parse().unwrap();
let mut rest = rest.chars();
assert_eq!(num, 1234);
assert_eq!(rest.next(), Some('x'));
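Note that, as stated in the comments, there's a little more nuance than this to extracting the initial number (for example, a '-' is accepted at any position, and is_numeric also matches non-ASCII numerals that parse rejects), but depending on your use case it won't be an issue.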
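For cases where that nuance matters, here is a hedged sketch of a stricter variant. It assumes the number is an i32 made of ASCII digits with an optional leading sign; split_leading_int is a hypothetical helper name:

```rust
/// Hypothetical helper: splits `s` into a leading i32 and the remainder.
/// Returns None if `s` does not start with a number that fits in an i32.
fn split_leading_int(s: &str) -> Option<(i32, &str)> {
    // A sign is only accepted as the very first character
    let sign_len = if s.starts_with('-') || s.starts_with('+') { 1 } else { 0 };
    let digits_len = s[sign_len..]
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(s.len() - sign_len);
    if digits_len == 0 {
        return None;
    }
    let (num, rest) = s.split_at(sign_len + digits_len);
    // parse() fails on overflow, which is reported as None here
    num.parse().ok().map(|n| (n, rest))
}

fn main() {
    let (num, rest) = split_leading_int("1234x").unwrap();
    let mut rest = rest.chars();
    assert_eq!(num, 1234);
    assert_eq!(rest.next(), Some('x'));
}
```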

The size for values of type `[char]` cannot be known at compilation time

I have the following Rust code ...
const BINARY_SIZE: usize = 5;
let mut all_bits: Vec<[char; BINARY_SIZE]> = Vec::new();
let mut one_bits: [char; BINARY_SIZE] = ['0'; BINARY_SIZE];
all_bits.push(one_bits);
for i in [0..BINARY_SIZE] {
    let one = all_bits[0];
    let first_ok = one[0]; // This works, first_ok is '0'
    let first_fail = one[i]; // This does not compile
}
How can I get from the variable 'one' the i'th character from the array?
For let first_fail = one[i]; the compiler gives me the error message:
error[E0277]: the size for values of type [char] cannot be known at compilation time
Your problem is that you're using the Range syntax incorrectly. By wrapping 0..BINARY_SIZE in brackets, you're iterating over the elements in a slice of Ranges, rather than iterating over the values within the range you specified.
This means that i is of type Range<usize> rather than type usize. You can prove this by adding let i: usize = i; at the top of the loop, which is rejected with a mismatched types error. And indexing with a range returns a slice, rather than an element of your array.
Try removing the brackets like so:
const BINARY_SIZE: usize = 5;
let mut all_bits: Vec<[char; BINARY_SIZE]> = Vec::new();
let mut one_bits: [char; BINARY_SIZE] = ['0'; BINARY_SIZE];
all_bits.push(one_bits);
for i in 0..BINARY_SIZE {
    let one = all_bits[0];
    let first_ok = one[0]; // This works, first_ok is '0'
    let first_fail = one[i]; // This works now
}
The error here really doesn't help much. But if you were using a helpful editor integration like rust-analyzer, you would see an inlay type hint showing that i is a Range.
Perhaps the Rust compiler error message here can be improved to trace back through the index type.
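The difference between the two loops can be made visible with a small experiment. This is only an illustrative sketch; the two function names are made up for the example:

```rust
const BINARY_SIZE: usize = 5;

/// Loops over `[0..BINARY_SIZE]`, an array holding a single Range.
/// Returns the length of the slice obtained in each iteration.
fn loop_over_bracketed() -> Vec<usize> {
    let bits = ['0'; BINARY_SIZE];
    let mut slice_lens = Vec::new();
    for r in [0..BINARY_SIZE] {
        // `r` is a Range<usize>, so indexing yields an unsized [char],
        // which is only usable behind a reference
        let slice: &[char] = &bits[r];
        slice_lens.push(slice.len());
    }
    slice_lens
}

/// Loops over `0..BINARY_SIZE`, yielding each usize index in turn.
fn loop_over_range() -> Vec<char> {
    let bits = ['0'; BINARY_SIZE];
    let mut chars = Vec::new();
    for i in 0..BINARY_SIZE {
        chars.push(bits[i]); // `i` is a usize, so this is a single char
    }
    chars
}

fn main() {
    assert_eq!(loop_over_bracketed(), vec![5]); // one iteration, one slice
    assert_eq!(loop_over_range(), vec!['0'; 5]); // five iterations
}
```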

Difference between double quotes and single quotes in Rust

I was doing Advent of Code 2020 day 3 in Rust for practice, since I am new to the language, and found that my code compiles or fails depending on whether I use single quotes or double quotes for my "tree" variable.
The first code snippet does not compile and gives the error: expected u8, found &[u8; 1]
use std::fs;
fn main() {
    let text: String = fs::read_to_string("./data/text").unwrap();
    let vec: Vec<&str> = text.lines().collect();
    let vec_vertical_len = vec.len();
    let vec_horizontal_len = vec[0].len();
    let mut i_pointer: usize = 0;
    let mut j_pointer: usize = 0;
    let mut tree_counter: usize = 0;
    let tree = b"#";
    loop {
        i_pointer += 3;
        j_pointer += 1;
        if j_pointer >= vec_vertical_len {
            break;
        }
        let i_index = i_pointer % vec_horizontal_len;
        let character = vec[j_pointer].as_bytes()[i_index];
        if character == tree {
            tree_counter += 1
        }
    }
    println!("{}", tree_counter);
}
The second snippet compiles and gives the right answer:
use std::fs;
fn main() {
    let text: String = fs::read_to_string("./data/text").unwrap();
    let vec: Vec<&str> = text.lines().collect();
    let vec_vertical_len = vec.len();
    let vec_horizontal_len = vec[0].len();
    let mut i_pointer: usize = 0;
    let mut j_pointer: usize = 0;
    let mut tree_counter: usize = 0;
    let tree = b'#';
    loop {
        i_pointer += 3;
        j_pointer += 1;
        if j_pointer >= vec_vertical_len {
            break;
        }
        let i_index = i_pointer % vec_horizontal_len;
        let character = vec[j_pointer].as_bytes()[i_index];
        if character == tree {
            tree_counter += 1
        }
    }
    println!("{}", tree_counter);
}
I did not find any reference explaining what is going on when using single or double quotes. Can someone help me?
The short answer is that it works similarly to Java: single quotes for characters and double quotes for strings.
let a: char = 'k';
let b: &'static str = "k";
The b prefix (b'' or b"") means the literal is interpreted as bytes instead: b'' gives a single u8, and b"" gives a reference to a byte array.
let a: u8 = b'k';
let b: &'static [u8; 1] = b"k";
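Applied to the question, the comparison fails because a u8 taken from as_bytes() is being compared with a &[u8; 1]. A small sketch of the two literal forms side by side (count_byte is a hypothetical helper, not part of the original program):

```rust
/// Hypothetical helper: counts how often the byte `target` occurs in `line`.
fn count_byte(line: &str, target: u8) -> usize {
    line.as_bytes().iter().filter(|&&b| b == target).count()
}

fn main() {
    let tree: u8 = b'#'; // byte literal: a single u8
    let tree_str: &[u8; 1] = b"#"; // byte string literal: reference to a one-element array
    // Indexing the byte string gives back the same u8
    assert_eq!(tree, tree_str[0]);
    assert_eq!(count_byte("..#.#", tree), 2);
}
```

So the first snippet could also be made to compile by comparing against tree[0], although b'#' states the intent more directly.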
The reason string literals result in references is due to how they are stored in the compiled binary. It would be too inefficient to store a copy of a string constant inside each method, so string literals are placed in a read-only data section of the binary. When your program is being executed, you are taking a reference to the bytes in that section (hence the static lifetime).
Going further down the rabbit hole, single quotes technically hold a code point (strictly, a Unicode scalar value, which is any code point except the surrogates). This is essentially what you might think of as a character. So a Unicode character is a single char even though its encoding may be multiple bytes long. A char always fits into a u32, so you can safely convert any char by using as u32, but not the other way around, since not all u32 values are valid code points. This also means b'\u{x}' is not valid, since \u{x} may produce characters that do not fit within a single byte.
// U+1F600 is a unicode smiley face
let a: char = '\u{1F600}';
assert_eq!(a as u32, 0x1F600);
However, you might find it interesting to know that since Rust strings are stored as UTF-8, code points over 127 occupy multiple bytes in a string, even those up to 255 that would fit into a single byte on their own. As you may already know, UTF-8 is simply a way of converting code points to bytes and back again.
let foo: &'static str = "\u{1F600}";
let foo_chars: Vec<char> = foo.chars().collect();
let foo_bytes: Vec<u8> = foo.bytes().collect();
assert_eq!(foo_chars.len(), 1);
assert_eq!(foo_bytes.len(), 4);
assert_eq!(foo_chars[0] as u32, 0x1F600);
assert_eq!(foo_bytes, vec![240, 159, 152, 128]);
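The same effect shows up for a code point that is small enough to fit in one byte. A brief sketch using U+00E9 (é), with utf8_bytes as a hypothetical helper introduced only for this example:

```rust
/// Hypothetical helper: the UTF-8 encoding of a single char.
fn utf8_bytes(c: char) -> Vec<u8> {
    c.to_string().into_bytes()
}

fn main() {
    let e_acute = '\u{E9}';
    // As a code point it fits into one byte...
    assert_eq!(e_acute as u32, 0xE9);
    // ...but its UTF-8 encoding takes two
    assert_eq!(e_acute.len_utf8(), 2);
    assert_eq!(utf8_bytes(e_acute), vec![0xC3, 0xA9]);
    // Going from u32 to char is fallible: surrogates are not valid chars
    assert_eq!(char::from_u32(0xD800), None);
    assert_eq!(char::from_u32(0x1F600), Some('\u{1F600}'));
}
```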

How to make a vector of received size?

I have a vector data with a size unknown at compile time. I want to create a new vector of exactly that size. These variants don't work:
let size = data.len();
let mut try1: Vec<u32> = vec![0 .. size]; // ah, you need a compile-time constant
let mut try2: Vec<u32> = Vec::new(size); // ah, there are no constructors with arguments
I'm a bit frustrated: I can't find any information in the Rust API docs, book, reference or rustbyexample.com about how to do such a simple, basic task with a vector.
This solution works, but I don't think it is a good one. It is strange to generate elements one by one, and I don't need any particular values for the elements:
let mut temp: Vec<u32> = range(0u32, data.len() as u32).collect();
The recommended way of doing this is in fact to form an iterator and collect it to a vector. What you want is not precisely clear, however; if you want [0, 1, 2, …, size - 1], you would create a range and collect it to a vector:
let x = (0..size).collect::<Vec<_>>();
(range(0, size) is now written (0..size); the free range function used in the question no longer exists in current Rust.)
If you wish a vector of zeroes, you would instead write it thus:
let x = std::iter::repeat(0).take(size).collect::<Vec<_>>();
If you merely want to preallocate the appropriate amount of space but not push values onto the vector, Vec::with_capacity(capacity) is what you want.
You should also consider whether you need it to be a vector or whether you can work directly with the iterator.
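The three options can be compared side by side. Note that the vec! macro's repeat form, vec![value; n], accepts a length computed at run time, so it is also a short way to get a zero-filled vector. The helper names below are hypothetical and only serve the example:

```rust
/// Hypothetical helper: a zero-filled vector as long as `data`.
fn zeroed_like(data: &[u32]) -> Vec<u32> {
    // The length of vec![value; n] does not have to be a constant
    vec![0; data.len()]
}

/// Hypothetical helper: the indices 0, 1, ..., data.len() - 1.
fn indices_like(data: &[u32]) -> Vec<usize> {
    (0..data.len()).collect()
}

fn main() {
    let data: Vec<u32> = vec![10, 20, 30];
    assert_eq!(zeroed_like(&data), vec![0, 0, 0]);
    assert_eq!(indices_like(&data), vec![0, 1, 2]);

    // Preallocate only: length stays 0 until values are pushed
    let mut out: Vec<u32> = Vec::with_capacity(data.len());
    assert!(out.is_empty() && out.capacity() >= data.len());
    out.extend(data.iter().map(|x| x * 2));
    assert_eq!(out, vec![20, 40, 60]);
}
```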
You can use the Vec::with_capacity() constructor followed by an unsafe set_len() call:
let n = 128;
let mut v: Vec<u32> = Vec::with_capacity(n);
unsafe { v.set_len(n); } // the elements are uninitialized at this point
v[12] = 64; // won't panic
This way the vector will "extend" over the uninitialized memory. Be careful: set_len() requires that the elements up to the new length are already initialized, so reading an element before writing it is undefined behavior even for primitives, and it will break horribly if the type has a destructor. If a zeroed buffer is acceptable, vec![0; n] is the safe way to get one.
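If skipping the zero fill really matters, one way to stay within set_len()'s safety contract is to write every element through spare_capacity_mut() before changing the length. This is a sketch, not a recommendation; filled_buffer is a hypothetical helper and the index values it writes are placeholders:

```rust
/// Hypothetical helper: builds a Vec of length `n` without zero-filling it first.
fn filled_buffer(n: usize) -> Vec<u32> {
    let mut v: Vec<u32> = Vec::with_capacity(n);
    // spare_capacity_mut() exposes the unused capacity as MaybeUninit<u32> slots
    for (i, slot) in v.spare_capacity_mut().iter_mut().take(n).enumerate() {
        slot.write(i as u32); // placeholder value: the element's own index
    }
    // SAFETY: the first `n` elements were just initialized and capacity >= n
    unsafe { v.set_len(n); }
    v
}

fn main() {
    let v = filled_buffer(4);
    assert_eq!(v, vec![0, 1, 2, 3]);
}
```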
