I want to convert a u32 into ASCII bytes.
Input: 1u32
Output: [49]
This was my attempt, but it produces an empty result for 0u32 and it uses a Vec. I would prefer an ArrayVec, but how do I know how many digits the number has? Is there a simple way to do this without any dynamic allocation?
let mut num = 1u32;
let base = 10u32;
let mut a: Vec<char> = Vec::new();
while num != 0 {
    let chars = char::from_digit(num % base, 10u32).unwrap();
    a.push(chars);
    num /= base;
}
let mut vec_of_u8s: Vec<u8> = a.iter().map(|c| *c as u8).collect();
vec_of_u8s.reverse();
println!("{:?}", vec_of_u8s)
Use the write! macro and ArrayVec with the capacity set to 10 (the maximum digits of a u32):
use std::io::Write;
use arrayvec::ArrayVec; // 0.7.2
fn main() {
    let input = 1u32;
    let mut buffer = ArrayVec::<u8, 10>::new();
    write!(buffer, "{}", input).unwrap();
    dbg!(buffer);
}
[src/main.rs:10] buffer = [
49,
]
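If pulling in a crate is not an option, a plain stack array can probably do the same job. The following is only a sketch: the helper name u32_to_ascii is made up for illustration, and it relies on a u32 having at most 10 decimal digits. It fills the buffer from the back and also handles 0, which the loop in the question skipped.

```rust
/// Writes the decimal ASCII digits of `num` into the tail of `buf`
/// and returns the used part. `u32_to_ascii` is a hypothetical helper name.
fn u32_to_ascii(mut num: u32, buf: &mut [u8; 10]) -> &[u8] {
    let mut start = buf.len();
    // Do-while style loop, so that 0 still produces one digit.
    loop {
        start -= 1;
        buf[start] = b'0' + (num % 10) as u8;
        num /= 10;
        if num == 0 {
            break;
        }
    }
    &buf[start..]
}

fn main() {
    let mut buf = [0u8; 10];
    println!("{:?}", u32_to_ascii(1, &mut buf)); // [49]
    println!("{:?}", u32_to_ascii(0, &mut buf)); // [48]
}
```

The returned slice borrows from the caller's array, so nothing is allocated on the heap.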
Related
I'm trying to format a u64 to a &str without dynamically allocating any memory on the heap. I want to manually declare space on the stack (e.g., let mut buffer = [0u8; 20]), print the u64 into the buffer, and get a &str from it with some unsafe.
I tried write!(&mut buffer[..], "{}", i), but it returns a Result<()>, so I couldn't get the length of the formatted string in order to unsafely convert it to a &str.
I'm currently copying the implementation of Display for u64 straight from the standard library. Is there a better way of doing this?
You could use a Cursor:
use std::io::{Write, Cursor};
fn main() {
    let mut cursor = Cursor::new([0u8; 20]);
    let i = 42u64;
    write!(cursor, "{i}").unwrap();
    let pos = cursor.position();
    let buffer = cursor.into_inner();
    let text = std::str::from_utf8(&buffer[..pos as usize]).unwrap();
    println!("{text}");
}
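A variation that may also work without Cursor: std implements Write for &mut [u8], and each write advances the slice past the bytes written. The length can then be recovered from how much of the slice is left. This is a sketch under that assumption; the function name format_u64 is hypothetical, and no unsafe is needed because from_utf8 checks the bytes.

```rust
use std::io::Write;

/// Formats `value` into `buffer` and returns the written part as a &str.
/// `format_u64` is a made-up name for illustration.
fn format_u64(value: u64, buffer: &mut [u8; 20]) -> &str {
    let total = buffer.len();
    // Writing to a `&mut [u8]` shrinks the slice from the front.
    let mut remaining: &mut [u8] = &mut buffer[..];
    write!(remaining, "{value}").expect("20 bytes hold any u64");
    let written = total - remaining.len();
    std::str::from_utf8(&buffer[..written]).expect("decimal digits are ASCII")
}

fn main() {
    let mut buffer = [0u8; 20];
    let text = format_u64(42, &mut buffer);
    println!("{text}");
}
```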
I was doing Advent of Code 2020, day 3, in Rust to practice because I am new to the language, and my code would or would not compile depending on whether I used single or double quotes for my "tree" variable.
The first code snippet does not compile and produces the error: expected u8, found &[u8; 1]
use std::fs;
fn main() {
    let text: String = fs::read_to_string("./data/text").unwrap();
    let vec: Vec<&str> = text.lines().collect();
    let vec_vertical_len = vec.len();
    let vec_horizontal_len = vec[0].len();
    let mut i_pointer: usize = 0;
    let mut j_pointer: usize = 0;
    let mut tree_counter: usize = 0;
    let tree = b"#";
    loop {
        i_pointer += 3;
        j_pointer += 1;
        if j_pointer >= vec_vertical_len {
            break;
        }
        let i_index = i_pointer % vec_horizontal_len;
        let character = vec[j_pointer].as_bytes()[i_index];
        if character == tree {
            tree_counter += 1
        }
    }
    println!("{}", tree_counter);
}
The second snippet compiles and gives the right answer:
use std::fs;
fn main() {
    let text: String = fs::read_to_string("./data/text").unwrap();
    let vec: Vec<&str> = text.lines().collect();
    let vec_vertical_len = vec.len();
    let vec_horizontal_len = vec[0].len();
    let mut i_pointer: usize = 0;
    let mut j_pointer: usize = 0;
    let mut tree_counter: usize = 0;
    let tree = b'#';
    loop {
        i_pointer += 3;
        j_pointer += 1;
        if j_pointer >= vec_vertical_len {
            break;
        }
        let i_index = i_pointer % vec_horizontal_len;
        let character = vec[j_pointer].as_bytes()[i_index];
        if character == tree {
            tree_counter += 1
        }
    }
    println!("{}", tree_counter);
}
I did not find any reference explaining what happens when using single versus double quotes. Can someone help me?
The short answer is that it works similarly to Java: single quotes for characters and double quotes for strings.
let a: char = 'k';
let b: &'static str = "k";
The b prefix (b'' or b"") means: take what is written here and interpret it as a byte literal instead.
let a: u8 = b'k';
let b: &'static [u8; 1] = b"k";
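Applied to the question, this is why comparing a u8 from as_bytes() with b"#" fails while b'#' works. A small sketch (the helper is_tree and the sample row are invented for illustration):

```rust
/// Hypothetical helper: is there a tree (`#`) at `column` in this row?
fn is_tree(row: &str, column: usize) -> bool {
    let character: u8 = row.as_bytes()[column];
    // b'#' is a u8, so it can be compared with another u8 directly.
    character == b'#'
}

fn main() {
    let tree_bytes: &[u8; 1] = b"#";
    let row = "..#.";
    // `row.as_bytes()[2] == tree_bytes` would not compile (u8 vs &[u8; 1]);
    // indexing into the byte string yields a u8 again.
    assert!(row.as_bytes()[2] == tree_bytes[0]);
    assert!(is_tree(row, 2));
    assert!(!is_tree(row, 0));
}
```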
The reason string literals result in references is how they are stored in the compiled binary. It would be too inefficient to store a copy of a string constant inside each function that uses it, so string literals are placed in a read-only data section of the binary. When your program runs, you are taking a reference to those bytes (hence the static lifetime).
Going further down the rabbit hole, single quotes technically hold a codepoint. This is essentially what you might think of as a character. So a Unicode character would also be considered a single codepoint even though it may be multiple bytes long. A codepoint is assumed to fit into a u32 or less so you can safely convert any char by using as u32, but not the other way around since not all u32 values will match valid codepoints. This also means b'\u{x}' is not valid since \u{x} may produce characters that will not fit within a single byte.
// U+1F600 is a unicode smiley face
let a: char = '\u{1F600}';
assert_eq!(a as u32, 0x1F600);
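To illustrate the one-way nature of that conversion, here is a short sketch using char::from_u32 from the standard library, which returns an Option because some u32 values are not valid codepoints. The specific values are just examples.

```rust
fn main() {
    // char -> u32 always works.
    assert_eq!('k' as u32, 107);

    // u32 -> char is fallible, so it returns an Option.
    assert_eq!(char::from_u32(0x1F600), Some('\u{1F600}'));
    assert_eq!(char::from_u32(0xD800), None); // surrogate, not a valid char
    assert_eq!(char::from_u32(0x11_0000), None); // beyond U+10FFFF

    // u8 -> char is infallible: every byte value maps to a codepoint.
    assert_eq!(char::from(b'k'), 'k');
}
```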
However, you might find it interesting that since Rust strings are stored as UTF-8, codepoints over 127 occupy multiple bytes in a string, even those from 128 to 255 that would fit into a single byte on their own. As you may already know, UTF-8 is simply a way of converting codepoints to bytes and back again.
let foo: &'static str = "\u{1F600}";
let foo_chars: Vec<char> = foo.chars().collect();
let foo_bytes: Vec<u8> = foo.bytes().collect();
assert_eq!(foo_chars.len(), 1);
assert_eq!(foo_bytes.len(), 4);
assert_eq!(foo_chars[0] as u32, 0x1F600);
assert_eq!(foo_bytes, vec![240, 159, 152, 128]);
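The same effect already shows up for a codepoint that would fit in one byte. A brief sketch using U+00E9 (é, codepoint 233) as an arbitrary example:

```rust
fn main() {
    let e_acute = '\u{E9}'; // é, codepoint 233
    // The codepoint itself is small enough for a single byte...
    assert!((e_acute as u32) < 256);
    // ...but its UTF-8 encoding takes two bytes.
    assert_eq!(e_acute.len_utf8(), 2);
    let bytes: Vec<u8> = "\u{E9}".bytes().collect();
    assert_eq!(bytes, vec![0xC3, 0xA9]);

    // ASCII stays at one byte per character.
    assert_eq!('k'.len_utf8(), 1);
    assert_eq!('\u{1F600}'.len_utf8(), 4);
}
```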
I'm doing some computational mathematics in Rust, and I have some large numbers which I store in an array of 24 values. I have functions that convert them to bytes and back, but they don't work correctly for u32 values, although they work fine for u64. The code sample can be found below:
fn main() {
    let mut bytes = [0u8; 96]; // since u32 is 4 bytes in my system, 4*24 = 96
    let mut j;
    let mut k: u32;
    let mut num: [u32; 24] = [1335565270, 4203813549, 2020505583, 2839365494, 2315860270, 442833049, 1854500981, 2254414916, 4192631541, 2072826612, 1479410393, 718887683, 1421359821, 733943433, 4073545728, 4141847560, 1761299410, 3068851576, 1582484065, 1882676300, 1565750229, 4185060747, 1883946895, 4146];
    println!("original_num: {:?}", num);
    for i in 0..96 {
        j = i / 4;
        k = (i % 4) as u32;
        bytes[i as usize] = (num[j as usize] >> (4 * k)) as u8;
    }
    println!("num_to_bytes: {:?}", &bytes[..]);
    num = [0u32; 24];
    for i in 0..96 {
        j = i / 4;
        k = (i % 4) as u32;
        num[j as usize] |= (bytes[i as usize] as u32) << (4 * k);
    }
    println!("recovered_num: {:?}", num);
}
The above code does not retrieve the correct numbers from the byte array. But if I change all u32 to u64, all 4s to 8s, and reduce the size of num from 24 values to 12, it all works fine. I assume I have some logic error in the u32 version. The correctly working u64 version can be found in this Rust playground.
Learning how to create a MCVE is a crucial skill when programming. For example, why do you have an array at all? Why do you reuse variables?
Your original first number is 0x4F9B1BD6, the output first number is 0x000B1BD6.
Comparing the intermediate bytes shows that you have garbage:
let num = 0x4F9B1BD6_u32;
println!("{:08X}", num);
let mut bytes = [0u8; BYTES_PER_U32];
for i in 0..bytes.len() {
    let k = (i % BYTES_PER_U32) as u32;
    bytes[i] = (num >> (4 * k)) as u8;
}
for b in &bytes {
    print!("{:X}", b);
}
println!();
4F9B1BD6
D6BD1BB1
Printing out the values of k:
for i in 0..bytes.len() {
    let k = (i % BYTES_PER_U32) as u32;
    println!("{} / {}", k, 4 * k);
    bytes[i] = (num >> (4 * k)) as u8;
}
Shows that you are trying to shift by multiples of 4 bits:
0 / 0
1 / 4
2 / 8
3 / 12
I'm pretty sure that every common platform today uses 8 bits for a byte, not 4.
This is why magic numbers are bad. If you had used constants for the values, you would have noticed the problem much sooner.
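For illustration, here is roughly what the manual loops might look like with named constants and a shift of 8 bits per byte. This is a sketch for a single value, not the asker's full program; the function names u32_to_bytes and bytes_to_u32 and the constant BITS_PER_BYTE are invented here, and the byte order (least significant byte first) follows the loop in the question.

```rust
const BITS_PER_BYTE: u32 = 8;
const BYTES_PER_U32: usize = 4;

/// Splits `num` into bytes, least significant byte first.
fn u32_to_bytes(num: u32) -> [u8; BYTES_PER_U32] {
    let mut bytes = [0u8; BYTES_PER_U32];
    for (i, byte) in bytes.iter_mut().enumerate() {
        // Shift by whole bytes (0, 8, 16, 24 bits), then truncate to u8.
        *byte = (num >> (BITS_PER_BYTE * i as u32)) as u8;
    }
    bytes
}

/// Reassembles a u32 from bytes produced by `u32_to_bytes`.
fn bytes_to_u32(bytes: [u8; BYTES_PER_U32]) -> u32 {
    let mut num = 0u32;
    for (i, &byte) in bytes.iter().enumerate() {
        num |= (byte as u32) << (BITS_PER_BYTE * i as u32);
    }
    num
}

fn main() {
    let num = 0x4F9B1BD6_u32;
    let bytes = u32_to_bytes(num);
    println!("{:02X?}", bytes);
    println!("{:08X}", bytes_to_u32(bytes));
}
```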
You wrote: "since u32 is 4 bytes in my system".
A u32 had better be 4 bytes on every system; that's why it's a u32.
Overall, don't reinvent the wheel. Use the byteorder crate or equivalent:
extern crate byteorder;
use byteorder::{BigEndian, ReadBytesExt, WriteBytesExt};
const LENGTH: usize = 24;
const BYTES_PER_U32: usize = 4;
fn main() {
    let num: [u32; LENGTH] = [
        1335565270, 4203813549, 2020505583, 2839365494, 2315860270, 442833049, 1854500981,
        2254414916, 4192631541, 2072826612, 1479410393, 718887683, 1421359821, 733943433,
        4073545728, 4141847560, 1761299410, 3068851576, 1582484065, 1882676300, 1565750229,
        4185060747, 1883946895, 4146,
    ];
    println!("original_num: {:?}", num);
    let mut bytes = [0u8; LENGTH * BYTES_PER_U32];
    {
        let mut bytes = &mut bytes[..];
        for &n in &num {
            bytes.write_u32::<BigEndian>(n).unwrap();
        }
    }
    let mut num = [0u32; LENGTH];
    {
        let mut bytes = &bytes[..];
        for n in &mut num {
            *n = bytes.read_u32::<BigEndian>().unwrap();
        }
    }
    println!("recovered_num: {:?}", num);
}
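If an extra dependency is unwanted, the standard library's u32::to_be_bytes and u32::from_be_bytes should cover the same ground. The sketch below is an alternative to the byteorder version, not a replacement for it; the function names encode and decode are invented for this example.

```rust
const LENGTH: usize = 24;
const BYTES_PER_U32: usize = 4;

/// Serializes every number as 4 big-endian bytes.
fn encode(nums: &[u32; LENGTH]) -> [u8; LENGTH * BYTES_PER_U32] {
    let mut bytes = [0u8; LENGTH * BYTES_PER_U32];
    for (chunk, n) in bytes.chunks_exact_mut(BYTES_PER_U32).zip(nums) {
        chunk.copy_from_slice(&n.to_be_bytes());
    }
    bytes
}

/// Reads the numbers back from 4-byte big-endian chunks.
fn decode(bytes: &[u8; LENGTH * BYTES_PER_U32]) -> [u32; LENGTH] {
    let mut nums = [0u32; LENGTH];
    for (n, chunk) in nums.iter_mut().zip(bytes.chunks_exact(BYTES_PER_U32)) {
        *n = u32::from_be_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
    }
    nums
}

fn main() {
    let mut num = [0u32; LENGTH];
    num[0] = 1335565270;
    num[LENGTH - 1] = 4146;
    let bytes = encode(&num);
    let recovered = decode(&bytes);
    println!("round trip ok: {}", recovered == num);
}
```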
I was trying out the LAPACK bindings for Rust when I came across some syntax that I could not find any information about.
The example code from https://github.com/stainless-steel/lapack:
let n = 3;
let mut a = vec![3.0, 1.0, 1.0, 1.0, 3.0, 1.0, 1.0, 1.0, 3.0];
let mut w = vec![0.0; n];
let mut work = vec![0.0; 4 * n];
let lwork = 4 * n as isize;
let mut info = 0;
lapack::dsyev(b'V', b'U', n, &mut a, n, &mut w, &mut work, lwork, &mut info);
for (one, another) in w.iter().zip(&[2.0, 2.0, 5.0]) {
    assert!((one - another).abs() < 1e-14);
}
What do b'V' and b'U' mean?
b'A' creates a byte literal. Specifically, it will be a u8 containing the ASCII value of the character:
fn main() {
    let what = b'a';
    println!("{}", what);
    // let () = what;
}
The commented line shows you how to find the type.
b"hello" is similar, but produces a reference to an array of u8, a byte string:
fn main() {
    let what = b"hello";
    println!("{:?}", what);
    // let () = what;
}
Things like this are documented in the Syntax Index which is currently only available in the nightly version of the docs.
It creates a u8 value holding the ASCII value of the character between the quotes.
For ASCII literals, it's the same as writing 'V' as u8.
Also, the b prefix on a double-quoted string creates a reference to a byte array containing the bytes of the string (only ASCII characters and escapes are allowed in it).
let s: &[u8; 11] = b"Hello world";
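These claims are easy to check with a few assertions. A small sketch (the values are the ASCII codes of the characters used above):

```rust
fn main() {
    // A byte literal is a plain u8 holding the ASCII code.
    let v: u8 = b'V';
    assert_eq!(v, 'V' as u8);
    assert_eq!(v, 86);

    // A byte string literal is a reference to a fixed-size array of u8.
    let s: &[u8; 11] = b"Hello world";
    assert_eq!(s.len(), 11);
    assert_eq!(s[0], b'H');
    // It holds the same bytes as the equivalent (ASCII) str.
    assert_eq!(&s[..], "Hello world".as_bytes());
}
```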
I am trying to convert long numbers to a string vector. For example, 17562 would become ["1", "7", "5", "6", "2"]. I have seen a lot of examples of converting ints to strings, but no ints to string vectors. I want to iterate over each digit individually.
Here is what I have so far, but it isn't working.
fn main() {
    let x = 42;
    let values: Vec<&str> = x.to_string().split(|c: char| c.is_alphabetic()).collect();
    println!("{:?}", values);
}
This gives me the following compiler error:
<anon>:3:29: 3:42 error: borrowed value does not live long enough
<anon>:3 let values: Vec<&str> = x.to_string().split(|c: char| c.is_alphabetic()).collect();
<anon>:3:88: 6:2 note: reference must be valid for the block suffix following statement 1 at 3:87...
<anon>:3 let values: Vec<&str> = x.to_string().split(|c: char| c.is_alphabetic()).collect();
<anon>:4 println!("{:?}", values);
<anon>:5
<anon>:6 }
<anon>:3:5: 3:88 note: ...but borrowed value is only valid for the statement at 3:4
<anon>:3 let values: Vec<&str> = x.to_string().split(|c: char| c.is_alphabetic()).collect();
<anon>:3:5: 3:88 help: consider using a `let` binding to increase its lifetime
<anon>:3 let values: Vec<&str> = x.to_string().split(|c: char| c.is_alphabetic()).collect();
The equivalent of what I am trying to do in python would be x = 42; x = list(str(x)); print(x)
Ok, the first problem is that you don't store the result of x.to_string() anywhere. As such, it will cease to exist at the end of the expression, meaning that values will be trying to reference a value that no longer exists. Hence the error. The simplest solution is to just store the temporary string somewhere so that it continues to exist:
fn main() {
    let x = 42;
    let x_str = x.to_string();
    let values: Vec<&str> = x_str.split(|c: char| c.is_alphabetic()).collect();
    println!("{:?}", values);
}
Second problem: this outputs ["42"] because you told it to split on letters. You probably meant to use is_numeric:
fn main() {
    let x = 42;
    let x_str = x.to_string();
    let values: Vec<&str> = x_str.split(|c: char| c.is_numeric()).collect();
    println!("{:?}", values);
}
Third problem: this outputs ["", "", ""], because those are the three strings between numeric characters. Split's argument is the separator. Thus, the third problem is that you're using entirely the wrong method to begin with.
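To see the separator semantics in isolation, here is a tiny sketch. The input "4a2" is an invented example where splitting on letters does something useful.

```rust
fn main() {
    // Every digit is treated as a separator, so only the (empty)
    // pieces around and between the digits remain.
    let parts: Vec<&str> = "42".split(|c: char| c.is_numeric()).collect();
    assert_eq!(parts, ["", "", ""]);

    // Splitting on letters keeps whatever sits between the letters.
    let parts: Vec<&str> = "4a2".split(|c: char| c.is_alphabetic()).collect();
    assert_eq!(parts, ["4", "2"]);

    // With no letters at all, nothing is split off.
    let parts: Vec<&str> = "42".split(|c: char| c.is_alphabetic()).collect();
    assert_eq!(parts, ["42"]);
}
```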
The closest direct equivalent to the Python code you listed would be:
fn main() {
    let x = 42;
    let values: Vec<String> = x.to_string().chars().map(|c| c.to_string()).collect();
    println!("{:?}", values);
}
At last, it outputs: ["4", "2"].
But, this is horribly inefficient: this takes the integer, allocates an intermediate buffer, prints the integer to it, turns it into a string. It takes each code point in that string, allocates an intermediate buffer, prints the code point to it, turns it into a string. Then it collects all these strings into a Vec, possibly reallocating more than once.
It works, but is a bit wasteful. If you don't care about waste, you can stop reading now.
You can make things a bit less wasteful by collecting code points instead of strings:
fn main() {
    let x = 42;
    let values: Vec<char> = x.to_string().chars().collect();
    println!("{:?}", values);
}
This outputs: ['4', '2']. Note the different quotes because we're using char instead of String.
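If the goal is to work with each digit as a number rather than as text, the chars can be mapped through char::to_digit. A sketch along those lines (the function name digit_values is made up):

```rust
/// Returns the decimal digits of `x` as numbers, most significant first.
fn digit_values(x: u32) -> Vec<u32> {
    x.to_string()
        .chars()
        // to_digit(10) returns None for non-digits; to_string on an
        // unsigned integer only produces '0'..='9'.
        .map(|c| c.to_digit(10).expect("only decimal digits expected"))
        .collect()
}

fn main() {
    println!("{:?}", digit_values(17562)); // [1, 7, 5, 6, 2]
}
```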
We can remove the intermediate allocations from Vec resizing by pre-allocating its storage, which gives us this version:
fn main() {
    let x = 42u32; // no negatives!
    let values = {
        if x == 0 {
            vec!['0']
        } else {
            // pre-allocate Vec so there's no resizing
            let digits = 1 + (x as f64).log10() as u32;
            let mut cs = Vec::with_capacity(digits as usize);
            let mut div = 10u32.pow(digits - 1);
            while div > 0 {
                cs.push((b'0' + ((x / div) % 10) as u8) as char);
                div /= 10;
            }
            cs
        }
    };
    println!("{:?}", values);
}
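On newer toolchains the floating-point logarithm can likely be avoided: integer types gained ilog10 and checked_ilog10 in Rust 1.67. The following sketch assumes such a toolchain; the function name digit_chars is invented. checked_ilog10 returns None for 0, which folds the special case into one expression.

```rust
/// Returns the decimal digits of `x` as chars, most significant first.
/// Requires Rust 1.67+ for `checked_ilog10`.
fn digit_chars(x: u32) -> Vec<char> {
    // Number of decimal digits; 0 has no logarithm, so treat it as 1 digit.
    let digits = x.checked_ilog10().map_or(1, |log| log + 1);
    // Pre-allocate so the Vec never has to resize.
    let mut cs = Vec::with_capacity(digits as usize);
    let mut div = 10u32.pow(digits - 1);
    while div > 0 {
        cs.push(char::from(b'0' + ((x / div) % 10) as u8));
        div /= 10;
    }
    cs
}

fn main() {
    println!("{:?}", digit_chars(42)); // ['4', '2']
    println!("{:?}", digit_chars(0)); // ['0']
}
```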
Unless you're doing this in a loop, I'd just stick to the correct, wasteful version.
If you are looking for a performant version, I'd just use this:
fn digits(mut val: u64) -> Vec<u8> {
    // An unsigned 64-bit number can have 20 digits
    let mut result = Vec::with_capacity(20);
    loop {
        let digit = val % 10;
        val = val / 10;
        result.push(digit as u8);
        if val == 0 { break }
    }
    result.reverse();
    result
}
fn main() {
    println!("{:?}", digits(0));
    println!("{:?}", digits(1));
    println!("{:?}", digits(9));
    println!("{:?}", digits(10));
    println!("{:?}", digits(11));
    println!("{:?}", digits(1234567890));
    println!("{:?}", digits(0xFFFFFFFFFFFFFFFF));
}
This may over allocate by a few bytes, but 20 bytes total is small unless you are doing this a whole bunch. It also leaves each value as a number, which you can convert to a string as needed.
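As one possible way to do that later conversion, each numeric digit can be offset from b'0' to get its ASCII character. A sketch (digits is repeated from above so the example stands alone; digits_to_string is a made-up name):

```rust
fn digits(mut val: u64) -> Vec<u8> {
    // An unsigned 64-bit number can have 20 digits
    let mut result = Vec::with_capacity(20);
    loop {
        result.push((val % 10) as u8);
        val /= 10;
        if val == 0 {
            break;
        }
    }
    result.reverse();
    result
}

/// Turns numeric digits (0..=9) back into their textual form.
fn digits_to_string(digits: &[u8]) -> String {
    digits.iter().map(|&d| char::from(b'0' + d)).collect()
}

fn main() {
    let ds = digits(1234567890);
    println!("{:?}", ds);
    println!("{}", digits_to_string(&ds));
}
```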
What about:
let ss = value.to_string()
    .chars()
    .map(|c| c.to_string())
    .collect::<Vec<_>>();
Not the greatest perf but reads well.