let hex = "100000000000000000".as_bytes().to_hex();
// hex == "313030303030303030303030303030303030"
println!("{:x}", 100000000000000000000000u64);
// literal out of range for u64
How can I get that value?
In Python, I would just call hex(100000000000000000000000) and I get '0x152d02c7e14af6800000'.
to_hex() comes from the hex crate.
One needs to be aware of the range of representable values for different numeric types in Rust. In this particular case, the value exceeds the limits of a u64, but the u128 type accommodates it. The following code produces the same output as the Python example:
fn main() {
    let my_string = "100000000000000000000000".to_string(); // `parse()` works with `&str` and `String`!
    let my_int = my_string.parse::<u128>().unwrap();
    let my_hex = format!("{:X}", my_int);
    println!("{}", my_hex);
}
Checked with the Rust Playground:
152D02C7E14AF6800000
In the general case, explicit arbitrary-precision arithmetic is required. A few suggestions from What's the best crate for arbitrary precision arithmetic in Rust? on Reddit (a num_bigint example follows the list):
num_bigint works on stable and does not have unsafe code.
ramp uses unsafe and does not work on stable Rust, but it is faster.
rust-gmp and rug bind to the state-of-the-art bigint implementation in C (GMP). They are the fastest and have the most features. You probably want to use one of those.
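For example, here's a minimal sketch with num_bigint (assuming num-bigint 0.4; BigUint implements FromStr and LowerHex), reproducing the Python result without the 0x prefix:

use num_bigint::BigUint;

fn main() {
    // Parse the decimal string into an arbitrary-precision integer...
    let n: BigUint = "100000000000000000000000".parse().unwrap();
    // ...and format it as hex, like Python's hex() minus the "0x"
    println!("{:x}", n); // 152d02c7e14af6800000
}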
Related
I read these docs and stumbled upon this line.
let an_integer = 5i32; // Suffix annotation
What does this mean? I am assuming that it has 5 as its value and i32 as its integer type. Is that correct?
Yes, that's correct. When you write the literal 5 in a program, it could be interpreted as a variety of types. (A literal is a value, such as 5, which is written directly into the source code instead of being computed.) If we want to express that a literal is of a certain type, we can append ("suffix") the type to it to make it explicit, as in 5i32.
This is only done with certain built-in types, such as integers and floating-point numbers, but it can come in handy in some cases. For example, the following is not valid:
fn main() {
    println!("{}", 1 << 32);
}
That's because if you specify no type at all for an integer, it defaults to i32. Since it's not valid to shift a 32-bit integer by 32 bits, Rust produces an error.
However, we can write this and it will work:
fn main() {
    println!("{}", 1u64 << 32);
}
That's because now the integer is a u64 and it's in range.
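For reference, the same suffix notation works for the other built-in numeric types:

fn main() {
    let a = 5i32;   // 32-bit signed integer
    let b = 5u8;    // 8-bit unsigned integer
    let c = 2.5f64; // 64-bit float
    println!("{} {} {}", a, b, c);
}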
The first part of the question is probably pretty common and there are enough code samples that explain how to generate a random string of alphanumerics. The piece of code I use is from here.
use rand::{thread_rng, Rng};
use rand::distributions::Alphanumeric;

fn main() {
    let rand_string: String = thread_rng()
        .sample_iter(&Alphanumeric)
        .take(30)
        .collect();
    println!("{}", rand_string);
}
This piece of code, however, does not compile (note: I'm on nightly):
error[E0277]: a value of type `String` cannot be built from an iterator over elements of type `u8`
 --> src/main.rs:8:10
  |
8 |         .collect();
  |          ^^^^^^^ value of type `String` cannot be built from `std::iter::Iterator<Item=u8>`
  |
  = help: the trait `FromIterator<u8>` is not implemented for `String`
Ok, the elements that are generated are of type u8. So I guess this is an array or vector of u8:
use rand::{thread_rng, Rng};
use rand::distributions::Alphanumeric;

fn main() {
    let r = thread_rng()
        .sample_iter(&Alphanumeric)
        .take(30)
        .collect::<Vec<_>>();
    let s = String::from_utf8_lossy(&r);
    println!("{}", s);
}
And this compiles and works!
2dCsTqoNUR1f0EzRV60IiuHlaM4TfK
All good, except that I would like to ask if someone could explain what exactly happens regarding the types and how this can be optimised.
Questions
.sample_iter(&Alphanumeric) produces u8 and not chars?
How can I avoid the second variable s and directly interpret a u8 as a UTF-8 character? I guess the representation in memory would not change at all?
The length of these strings should always be 30. How can I optimise the heap allocation of a Vec away? Also they could actually be char[] instead of Strings.
.sample_iter(&Alphanumeric) produces u8 and not chars?
Yes, this was changed in rand v0.8. You can see this in the docs for 0.7.3:
impl Distribution<char> for Alphanumeric
But then in the docs for 0.8.0:
impl Distribution<u8> for Alphanumeric
How can I avoid the second variable s and directly interpret a u8 as a UTF-8 character? I guess the representation in memory would not change at all?
There are a couple of ways to do this, the most obvious being to just cast every u8 to a char:
let s: String = thread_rng()
    .sample_iter(&Alphanumeric)
    .take(30)
    .map(|x| x as char)
    .collect();
Or, using the From<u8> impl for char:
let s: String = thread_rng()
    .sample_iter(&Alphanumeric)
    .take(30)
    .map(char::from)
    .collect();
Of course here, since you know every u8 must be valid UTF-8, you can use String::from_utf8_unchecked, which is faster than from_utf8_lossy (although probably around the same speed as the as char method):
let s = unsafe {
    String::from_utf8_unchecked(
        thread_rng()
            .sample_iter(&Alphanumeric)
            .take(30)
            .collect::<Vec<_>>(),
    )
};
If, for some reason, the unsafe bothers you and you want to stay in safe code, you can use the slower String::from_utf8 and unwrap the Result so you get a panic instead of undefined behavior (even though this code should never panic or hit UB):
let s = String::from_utf8(
    thread_rng()
        .sample_iter(&Alphanumeric)
        .take(30)
        .collect::<Vec<_>>(),
).unwrap();
The length of these strings should always be 30. How can I optimise the heap allocation of a Vec away? Also they could actually be char[] instead of Strings.
First of all, trust me, you don't want arrays of chars. They are not fun to work with. If you want a stack string, use a u8 array and then a function like std::str::from_utf8 or the faster std::str::from_utf8_unchecked (again, only usable because you know valid UTF-8 will be generated).
As for optimizing the heap allocation away, refer to this answer. Basically, it's not possible without a bit of hackiness/ugliness (such as writing your own function that collects an iterator into an array of 30 elements).
Once const generics are finally stabilized, there'll be a much prettier solution.
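In the meantime, here's a minimal stack-only sketch (assuming rand 0.8, where Alphanumeric yields ASCII bytes): it fills a fixed-size array in place and views it as a &str with no heap allocation at all.

use rand::{thread_rng, Rng};
use rand::distributions::Alphanumeric;

fn main() {
    // Fixed-size buffer on the stack; no Vec, no String.
    let mut buf = [0u8; 30];
    let mut rng = thread_rng();
    for b in buf.iter_mut() {
        *b = rng.sample(Alphanumeric);
    }
    // Safe to unwrap: Alphanumeric only yields ASCII bytes.
    let s: &str = std::str::from_utf8(&buf).unwrap();
    println!("{}", s);
}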
The first example in the docs for rand::distributions::Alphanumeric shows that if you want to convert the u8s into chars you should map them using the char::from function:
use rand::{thread_rng, Rng};
use rand::distributions::Alphanumeric;

fn main() {
    let rand_string: String = thread_rng()
        .sample_iter(&Alphanumeric)
        .map(char::from) // map added here
        .take(30)
        .collect();
    println!("{}", rand_string);
}
playground
I'd like to define a function that can return a number whose type is specified when the function is called. The function takes a buffer (Vec<u8>) and returns a numeric value, e.g.:
let byte = buf_to_num::<u8>(&buf);
let integer = buf_to_num::<u32>(&buf);
The buffer contains an ASCII string that represents a number, e.g. b"827", where each byte is the ASCII code of a digit.
This is my non-working code:
extern crate num;

use num::Integer;
use std::ops::{MulAssign, AddAssign};

fn buf_to_num<T: Integer + MulAssign + AddAssign>(buf: &Vec<u8>) -> T {
    let mut result: T;
    for byte in buf {
        result *= 10;
        result += (byte - b'0');
    }
    result
}
I get mismatched type errors for both the addition and the multiplication lines (expected type T, found u32). So I guess my problem is how to tell the type system that T can be expressed in terms of a literal 10 or in terms of the result of (byte - b'0')?
Welcome to the joys of having to specify every single operation you're using as a generic bound. It's a pain, but it's worth it.
You have two problems:
result *= 10; has no corresponding From<_> conversion. When you write 10, there is no way for the compiler to know what 10 means as a T: it only knows primitive types, plus any conversions you have defined by implementing From<_> traits.
You're mixing up two operations: the conversion from ASCII bytes to digits, and the arithmetic itself.
We need to make two assumptions for this:
We will require From<u32>, so we can build a T from u32 values
We will also clarify your logic: convert each u8 to char so we can use to_digit() to obtain a u32, before using From<u32> to get a T.
use std::ops::{MulAssign, AddAssign};

fn parse_to_i<T: From<u32> + MulAssign + AddAssign>(buf: &[u8]) -> T {
    // Start from zero, converted into T via From<u32>
    let mut buffer: T = 0u32.into();
    for o in buf {
        buffer *= 10.into();
        // Convert the ASCII byte to its digit value (0 on a non-digit), then into T
        buffer += (*o as char).to_digit(10).unwrap_or(0).into();
    }
    buffer
}
You can convince yourself of its behavior on the playground
The multiplication is resolved by calling .into() on the constant 10, which makes it benefit from our requirement of From<u32> for T and allows the Rust compiler to know we're not doing silly stuff.
The final change is to initialize buffer to 0 (converted into T) instead of leaving it uninitialized.
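For example, a hypothetical call site (the target type is chosen by the annotation at the call):

fn main() {
    let buf = b"827".to_vec();
    let as_u32: u32 = parse_to_i(&buf);
    let as_u64: u64 = parse_to_i(&buf);
    assert_eq!(as_u32, 827);
    assert_eq!(as_u64, 827);
}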
Let me know if this makes sense to you (or if it doesn't), and I'll be glad to elaborate further if there is a problem :-)
I'm working with an old Rust module that uses the extprim crate to provide a u128 type.
I'm trying to use this with a newer crate that uses Rust's primitive u128 type (available since Rust 1.26).
What's an efficient way to convert back and forth between these two types?
Update:
When your rustc version is 1.26.0 or later, the From trait is implemented and you can simply use into and from.
For older versions, see below.
As a note: "The most efficient way" is very subjective.
I would use the low64() and high64() methods to build a primitive Rust u128.
extern crate extprim; // 1.6.0

use extprim::u128;

fn main() {
    let number = u128::u128::from_parts(6_692_605_942, 14_083_847_773_837_265_618);
    println!("{:?}", number);

    // going forth
    let real_number = u128::from(number.high64()) << 64 | u128::from(number.low64());
    println!("{:?}", real_number);
    assert_eq!(number.to_string(), real_number.to_string());

    // and back
    let old_number = u128::u128::from_parts((real_number >> 64) as u64, (real_number) as u64);
    assert_eq!(number, old_number);
}
(playground)
Since you can't compare both directly, I used the to_string() function to convert them to a string and compare those.
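If you need the conversions in more than one place, they can be wrapped in a pair of helper functions (hypothetical names, same logic as the snippet above):

extern crate extprim; // 1.6.0

// ext_to_prim / prim_to_ext are hypothetical helper names.
fn ext_to_prim(x: extprim::u128::u128) -> u128 {
    (u128::from(x.high64()) << 64) | u128::from(x.low64())
}

fn prim_to_ext(x: u128) -> extprim::u128::u128 {
    extprim::u128::u128::from_parts((x >> 64) as u64, x as u64)
}

fn main() {
    let ext = extprim::u128::u128::from_parts(6_692_605_942, 14_083_847_773_837_265_618);
    let prim = ext_to_prim(ext);
    assert_eq!(prim_to_ext(prim), ext);
}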
What's the most straightforward way to convert a hex string into a float? (without using 3rd party crates).
Does Rust provide some equivalent to Python's struct.unpack('!f', bytes.fromhex('41973333'))
See this question for Python & Java, mentioned here for reference.
This is quite easy without external crates:
fn main() {
    // Hex string to 4 bytes, a.k.a. u32
    let bytes = u32::from_str_radix("41973333", 16).unwrap();
    // Reinterpret the 4 bytes as an f32:
    let float = unsafe { std::mem::transmute::<u32, f32>(bytes) };
    // Prints 18.9
    println!("{}", float);
}
Playground link.
There's f32::from_bits, which performs the transmute in safe code. Note that transmuting is not the same as struct.unpack, since struct.unpack lets you specify endianness and has a well-defined IEEE representation.
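For reference, a minimal sketch of the safe variants (f32::from_bits is stable since Rust 1.20, f32::from_be_bytes since 1.32):

fn main() {
    // Safe equivalent of the transmute above.
    let bits = u32::from_str_radix("41973333", 16).unwrap();
    println!("{}", f32::from_bits(bits)); // 18.9

    // If you need explicit big-endian byte order (like struct.unpack's '!'),
    // build the byte array and use from_be_bytes.
    let bytes = [0x41, 0x97, 0x33, 0x33];
    println!("{}", f32::from_be_bytes(bytes)); // 18.9
}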