I'm trying to read 4 bytes from a socket and then transmute those bytes into a single u32.
let mut length: u32 = 0;
// Transmute 4 byte array into u32
unsafe {
let length = mem::transmute::<[u8; 4], u32>(buf); // buf is the 4 item array of u8 that the socket was read into
println!("1: {:}", length);
}
println!("Length: {:}", length);
However, length has the original value of 0 once outside of the unsafe block. How can I get around this?
In the inner block, you're not assigning a new value to the outer length binding, you are shadowing it by defining a new length binding.
I believe that in a simple code snippet such as this (where the outer mutable variable isn't reassigned) normally there should be a compiler warning along the lines of:
warning: variable does not need to be mutable
In any case, since you have declared an outer mutable binding, all you have to do to re-assign the original variable is drop the keyword let in the unsafe block.
Additionally, as pointed out in the comments:
There's no need to initialize this particular binding with a value, because that value is always going to be replaced before it is ever used.
If you aren't really re-assigning that variable later on (in code not shown in your question), then it doesn't even need to be mutable. Mutability doesn't force you to initialize the variable at the point of its declaration in the source code.
So this should suffice:
let length: u32; // or mut
unsafe {
length = mem::transmute::<[u8; 4], u32>(buf);
println!("1: {:}", length);
}
println!("Length: {:}", length);
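The difference between shadowing and assignment described above can be seen in isolation. This is only a minimal sketch; the helper names `shadowed` and `assigned` are made up for illustration and the literal 42 stands in for the transmuted value.

```rust
// Hypothetical helper: the inner `let` creates a new binding,
// so the outer `length` is never touched.
fn shadowed() -> u32 {
    let length: u32 = 0;
    {
        let length = 42; // shadows the outer `length` only inside this block
        let _ = length;
    }
    length // still 0
}

// Hypothetical helper: without `let`, the outer binding itself is assigned.
fn assigned() -> u32 {
    let length: u32; // declared here, initialized exactly once below
    {
        length = 42;
    }
    length
}

fn main() {
    println!("shadowed: {}", shadowed());
    println!("assigned: {}", assigned());
}
```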
Note (in addition to Theodoros' answer) that an unsafe block is an expression, so you can do this:
let buf = [1u8, 2, 3, 4];
// Transmute 4 byte array into u32
let length = unsafe {
let length = std::mem::transmute::<[u8; 4], u32>(buf); // buf is the 4 item array of u8 that the socket was read into
println!("1: {:}", length);
length
};
println!("Length: {:}", length);
Or in short:
let length = unsafe {std::mem::transmute::<[u8; 4], u32>(buf)};
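For this particular conversion, unsafe may not be needed at all: since Rust 1.32, `u32::from_ne_bytes` gives the same native-endian result as the transmute. A minimal sketch, where the helper name `length_from_bytes` is an assumption for illustration:

```rust
// Hypothetical helper: safe, native-endian equivalent of the transmute above.
fn length_from_bytes(buf: [u8; 4]) -> u32 {
    u32::from_ne_bytes(buf)
}

fn main() {
    let buf = [1u8, 2, 3, 4];
    // Both conversions reinterpret the same four bytes in native byte order.
    let via_transmute = unsafe { std::mem::transmute::<[u8; 4], u32>(buf) };
    assert_eq!(via_transmute, length_from_bytes(buf));
    println!("Length: {}", length_from_bytes(buf));
}
```

If the bytes come from a network protocol with a fixed byte order, `from_be_bytes` or `from_le_bytes` is probably the better fit than the native-endian variant.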
I want to change a String into a vector of bytes and also modify its values. I looked around and found "How do I convert a string into a vector of bytes in rust?",
but that only gives me a reference, so I cannot modify the vector. I want a to be 0, b to be 1 and so on, so after changing it into bytes I also need to subtract 97. Here is my attempt:
fn main() {
let s: String = "abcdefg".to_string();
let mut vec = (s.as_bytes()).clone();
println!("{:?}", vec);
for i in 0..vec.len() {
vec[i] -= 97;
}
println!("{:?}", vec);
}
but the compiler says
error[E0594]: cannot assign to `vec[_]`, which is behind a `&` reference
Can anyone help me to fix this?
You could get a Vec<u8> out of the String with the into_bytes method. An even better way, though, may be to iterate over the String's bytes with the bytes method, do the maths on the fly, and then collect the result:
fn main() {
let s = "abcdefg";
let vec: Vec<u8> = s.bytes().map(|b| b - b'a').collect();
println!("{:?}", vec); // [0, 1, 2, 3, 4, 5, 6]
}
But as @SvenMarnach correctly points out, this won't re-use s's buffer but allocate a new one. So, unless you need s again, the into_bytes method will be more efficient.
Strings in Rust are encoded in UTF-8. The (safe) interface of the String type enforces that the underlying buffer is always valid UTF-8, so it can't allow direct arbitrary byte modifications. However, you can convert a String into a Vec<u8> using the into_bytes() method. You can then modify the vector, and potentially convert it back to a string using String::from_utf8() if desired. The last step will verify that the buffer is still valid UTF-8, and will fail if it isn't.
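A minimal sketch of that round trip, assuming lowercase ASCII input as in the question. The helper name `letters_to_indices` is made up for illustration.

```rust
// Hypothetical helper: consumes the String and reuses its buffer.
fn letters_to_indices(s: String) -> Vec<u8> {
    let mut bytes = s.into_bytes();
    for b in bytes.iter_mut() {
        // Assumes lowercase ASCII; wrapping_sub avoids a panic on other input.
        *b = b.wrapping_sub(b'a');
    }
    bytes
}

fn main() {
    let v = letters_to_indices(String::from("abcdefg"));
    println!("{:?}", v);

    // Converting back re-validates the bytes as UTF-8.
    let s = String::from_utf8(v).expect("small byte values are valid UTF-8");
    println!("{:?}", s);

    // A lone 0xff byte is never valid UTF-8, so this conversion fails.
    assert!(String::from_utf8(vec![0xff]).is_err());
}
```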
Instead of modifying the bytes of the string, you could also consider modifying the characters, which are potentially encoded by multiple bytes in the UTF-8 encoding. You can iterate over the characters of the string using the chars() method, convert each character to whatever you want, and then collect into a new string, or alternatively into a vector of integers, depending on your needs.
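A character-based version might look like the following sketch. The helper name `alphabet_positions` is an assumption, and skipping anything that is not a lowercase ASCII letter is just one possible policy.

```rust
// Hypothetical helper: maps 'a' -> 0, 'b' -> 1, ... and skips everything else.
fn alphabet_positions(s: &str) -> Vec<u32> {
    s.chars()
        .filter(|c| c.is_ascii_lowercase())
        .map(|c| c as u32 - 'a' as u32)
        .collect()
}

fn main() {
    println!("{:?}", alphabet_positions("abcdefg"));

    // The same iterator style can collect into a new String instead.
    let upper: String = "abcdefg".chars().map(|c| c.to_ascii_uppercase()).collect();
    println!("{}", upper);
}
```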
To understand what's going on, check the type of the vec variable. If you don't have an IDE/editor that can display the type to you, you can do this:
let mut vec: () = (s.as_bytes()).clone();
The resulting error message is self-explanatory:
3 | let mut vec: () = (s.as_bytes()).clone();
| -- ^^^^^^^^^^^^^^^^^^^^^^ expected `()`, found `&[u8]`
So, what's happening is that the .clone() simply cloned the reference returned by as_bytes(), and didn't create a Vec<u8> from the &[u8]. In general, you can use .to_owned() in this kind of case, but in this specific case, using .into_bytes() on the String is best.
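If the original String is still needed afterwards, copying the borrowed bytes into an owned vector is one option. A minimal sketch; the helper name `owned_bytes` is hypothetical:

```rust
// Hypothetical helper: copies the borrowed bytes into an owned, mutable Vec<u8>.
fn owned_bytes(s: &str) -> Vec<u8> {
    s.as_bytes().to_vec() // .to_owned() on a &[u8] does the same thing
}

fn main() {
    let s = String::from("abcdefg");
    let mut vec = owned_bytes(&s);
    for b in vec.iter_mut() {
        *b -= b'a'; // fine here because every byte is a lowercase ASCII letter
    }
    println!("{:?}", vec);
    println!("{}", s); // `s` was only borrowed, so it is still usable
}
```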
I couldn't find a simple thing on google. How to convert a simple Rust array to a pointer?
How to get pointer to [u8; 3]? I tried doing as *mut u8 but it wouldn't work.
Use as_ptr() or as_mut_ptr().
fn main() {
let a: [u8; 3] = [1, 2, 3];
println!("{:p}", a.as_ptr());
}
0x7ffc97350edd
Arrays coerce to slices, so any slice method may be called on an array.
Note that arrays in Rust are just blobs of memory. They do not point to objects stored elsewhere, the way array names in C often appear to (because they decay to pointers); they are the sequence of objects itself.
If you have some data and want to get a pointer to it, you'll usually create a reference instead, since only references (and other pointers) can be cast to pointers with as:
fn main() {
let a: [u8; 3] = [1, 2, 3]; // a blob of data on the stack...
let a_ref = &a; // a shared reference to this data...
let a_ptr = a_ref as *const u8; // and a pointer, created from the reference
println!("{:p}", a_ptr);
}
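Since the question asked for a `*mut u8`, here is a sketch of the mutable counterpart. The helper name `set_first` is made up for illustration, and the writes are only sound because the pointers stay within a live array that is not otherwise borrowed.

```rust
// Hypothetical helper: writes through a raw pointer obtained with as_mut_ptr().
fn set_first(a: &mut [u8; 3], value: u8) {
    let p: *mut u8 = a.as_mut_ptr();
    // Sound: `p` points at the first element of an exclusively borrowed array.
    unsafe { *p = value };
}

fn main() {
    let mut a: [u8; 3] = [1, 2, 3];
    set_first(&mut a, 9);

    // The cast route: mutable reference -> array pointer -> element pointer.
    let p2 = &mut a as *mut [u8; 3] as *mut u8;
    // Sound: index 1 is inside the three-element array.
    unsafe { *p2.add(1) = 8 };

    println!("{:?}", a);
}
```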
Briefly, some data types are stored on the stack, as the compiler knows how much memory they will require at run time. Other data types are more flexible, and are stored on the heap. The pointer to the data stays on the stack, pointing to the heap data.
My question is: if the Vec data is on the heap, how is it that i32 (and other normally stack-stored types) can be accessed as if they actually were on the stack (copied by indexing)?
In other words, it makes sense to me that I cannot move a String out of the Vec: Strings don't implement Copy and are normally moved, and the same happens when they are elements of a Vec. However, i32 is normally copied, so why does this also happen when the values are part of the vector data on the heap?
Please feel free to point out any conceptual error and point me to existing material if you think I missed something. I have read The Rust Programming Language and checked around a bit.
fn main() {
// int in stack
let i: i32 = 1;
let _ic = i;
println!("{}", i);
// String on heap
let s: String = String::from("ciao cippina");
let _sc = &s;
println!("{}", s);
// array and data on the stack
let ari = [1, 2, 3];
println!("{:?}", &ari);
println!("a 0 {}", ari[0]);
// array and Pointers on the stack, data on the heap
let ars = [String::from("ciao"), String::from("mondo")];
println!("{:?}", &ars);
println!("a 0 {}", ars[0]);
// let _ars_1 = ars[0]; // ERROR, cannot move out of array
// Vec int, its Pointer on stack, all the rest on heap
let veci = vec![2, 4, 5, 6];
println!("{:?}", &veci);
println!("a 0 {}", veci[0]);
let _veci_1 = veci[0]; // NO ERROR HERE ??
// Vec string, its Pointer on stack, all the rest on heap
let vecs = vec![String::from("ciao"), String::from("mondo")];
println!("{:?}", &vecs);
println!("a 0 {}", vecs[0]);
// let _vecs_1 = vecs[0]; // ERROR, cannot move out of Vec
}
Just because an element of a vector lives on the heap doesn't mean that the compiler can't know the size of the element. It doesn't matter where the element lives: if a type is "copyable" (implements Copy), it can be copied from the stack to the heap and vice versa.
In your case, i32 occupies 4 bytes whether on the heap or on the stack (ignoring alignment concerns).
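To make the contrast concrete, here is a small sketch. The helper names `first_int` and `first_string` are hypothetical. Indexing a Vec of a Copy type copies the bytes out of the heap allocation, while a non-Copy element has to be borrowed or cloned.

```rust
// Hypothetical helper: i32 is Copy, so indexing copies the value out.
fn first_int(v: &[i32]) -> i32 {
    v[0]
}

// Hypothetical helper: String is not Copy, so an explicit clone is needed.
fn first_string(v: &[String]) -> String {
    v[0].clone()
}

fn main() {
    let veci = vec![2, 4, 5, 6];
    let copied = first_int(&veci);

    let vecs = vec![String::from("ciao"), String::from("mondo")];
    let borrowed: &String = &vecs[0]; // borrowing never moves anything
    let cloned = first_string(&vecs); // a new heap allocation with the same text

    println!("{} {} {}", copied, borrowed, cloned);
    println!("{:?} {:?}", veci, vecs); // both vectors are still intact
}
```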
Is there a better way to cast Vec<i8> to Vec<u8> in Rust except for these two?
1. creating a copy by mapping and casting every entry
2. using std::mem::transmute
The (1) is slow, the (2) is "transmute should be the absolute last resort" according to the docs.
A bit of background maybe: I'm getting a Vec<i8> from the unsafe gl::GetShaderInfoLog() call and want to create a string from this vector of chars by using String::from_utf8().
The other answers provide excellent solutions for the underlying problem of creating a string from Vec<i8>. To answer the question as posed, creating a Vec<u8> from data in a Vec<i8> can be done without copying or transmuting the vector. As pointed out by @trentcl, transmuting the vector directly constitutes undefined behavior because Vec is allowed to have different layout for different types.
The correct (though still requiring the use of unsafe) way to transfer a vector's data without copying it is:
obtain the *mut i8 pointer to the data in the vector, along with its length and capacity
leak the original vector to prevent it from freeing the data
use Vec::from_raw_parts to build a new vector, giving it the pointer cast to *mut u8 - this is the unsafe part, because we are vouching that the pointer contains valid and initialized data, and that it is not in use by other objects, and so on.
This is not UB because the new Vec is given the pointer of the correct type from the start. Code:
fn vec_i8_into_u8(v: Vec<i8>) -> Vec<u8> {
// ideally we'd use Vec::into_raw_parts, but it's unstable,
// so we have to do it manually:
// first, make sure v's destructor doesn't free the data
// it thinks it owns when it goes out of scope
let mut v = std::mem::ManuallyDrop::new(v);
// then, pick apart the existing Vec
let p = v.as_mut_ptr();
let len = v.len();
let cap = v.capacity();
// finally, adopt the data into a new Vec
unsafe { Vec::from_raw_parts(p as *mut u8, len, cap) }
}
fn main() {
let v = vec![-1i8, 2, 3];
assert!(vec_i8_into_u8(v) == vec![255u8, 2, 3]);
}
transmute on a Vec is always, 100% wrong, causing undefined behavior, because the layout of Vec is not specified. However, as the page you linked also mentions, you can use raw pointers and Vec::from_raw_parts to perform this correctly. user4815162342's answer shows how.
(std::mem::transmute is the only item in the Rust standard library whose documentation consists mostly of suggestions for how not to use it. Take that how you will.)
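However, in this case, from_raw_parts is also unnecessary. The best way to deal with C strings in Rust is with the wrappers in std::ffi, CStr and CString. There may be better ways to work this into your real code, but here's one way you could use CStr to borrow the log buffer as a &str (the buffer is allocated as bytes because CStr::from_bytes_with_nul takes a &[u8]; its pointer is cast to *mut GLchar for the call):
use std::ffi::CStr;
const BUF_SIZE: usize = 1000;
let mut info_log: Vec<u8> = vec![0; BUF_SIZE];
// receives the number of characters written, excluding the terminating nul
let mut len: gl::types::GLsizei = 0;
unsafe {
    gl::GetShaderInfoLog(shader, BUF_SIZE as gl::types::GLsizei, &mut len, info_log.as_mut_ptr() as *mut gl::types::GLchar);
}
// include the terminating nul in the slice handed to CStr
let log = CStr::from_bytes_with_nul(&info_log[..len as usize + 1])
    .expect("Slice must be nul terminated and contain no nul bytes")
    .to_str()
    .expect("Slice must be valid UTF-8 text");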
Notice there is no unsafe code except to call the FFI function; you could also use with_capacity + set_len (as in wasmup's answer) to skip initializing the Vec to 1000 zeros, and use from_bytes_with_nul_unchecked to skip checking the validity of the returned string.
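The CStr conversion can be tried without an OpenGL context. The sketch below uses a hand-filled byte buffer as a stand-in for what the FFI call would write; the helper name `log_from_buffer` and its `written` parameter are assumptions for illustration, not part of any GL binding.

```rust
use std::ffi::CStr;

// Hypothetical helper: `written` is the byte count reported by the C side,
// excluding the terminating nul.
fn log_from_buffer(buf: &[u8], written: usize) -> Option<&str> {
    // Take the written bytes plus the nul; None if the buffer is too short.
    let bytes = buf.get(..written.checked_add(1)?)?;
    // Fails on a missing terminator, an interior nul, or invalid UTF-8.
    CStr::from_bytes_with_nul(bytes).ok()?.to_str().ok()
}

fn main() {
    let mut buf = vec![0u8; 16];
    buf[..5].copy_from_slice(b"error");
    println!("{:?}", log_from_buffer(&buf, 5));
}
```

Returning an Option instead of calling expect is a design choice for the sketch; real code might prefer to keep the two distinct error types.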
See this:
fn get_compilation_log(&self) -> String {
let mut len = 0;
unsafe { gl::GetShaderiv(self.id, gl::INFO_LOG_LENGTH, &mut len) };
assert!(len > 0);
let mut buf = Vec::with_capacity(len as usize);
let buf_ptr = buf.as_mut_ptr() as *mut gl::types::GLchar;
unsafe {
gl::GetShaderInfoLog(self.id, len, std::ptr::null_mut(), buf_ptr);
buf.set_len(len as usize);
};
match String::from_utf8(buf) {
Ok(log) => log,
Err(vec) => panic!("Could not convert compilation log from buffer: {}", vec),
}
}
See ffi:
// from_ptr is an unsafe fn: strz_ptr must point to a valid nul-terminated string
let s = unsafe { CStr::from_ptr(strz_ptr) }.to_str().unwrap();
I am reading raw data from a file and I want to convert it to an integer:
fn main() {
let buf: &[u8] = &[0, 0, 0, 1];
let num = slice_to_i8(buf);
println!("1 == {}", num);
}
pub fn slice_to_i8(buf: &[u8]) -> i32 {
unimplemented!("what should I do here?")
}
I would do a cast in C, but what do I do in Rust?
I'd suggest using the byteorder crate (which also works in a no-std environment):
use byteorder::{BigEndian, ReadBytesExt}; // 1.2.7
fn main() {
let mut buf: &[u8] = &[0, 0, 0, 1];
let num = buf.read_u32::<BigEndian>().unwrap();
assert_eq!(1, num);
}
This handles oddly-sized slices and automatically advances the buffer so you can read multiple values.
As of Rust 1.32, you can also use the from_le_bytes / from_be_bytes / from_ne_bytes inherent methods on integers:
fn main() {
let buf = [0, 0, 0, 1];
let num = u32::from_be_bytes(buf);
assert_eq!(1, num);
}
These methods only handle fixed-length arrays to avoid dealing with the error when not enough data is present. If you have a slice, you will need to convert it into an array.
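One way to do that conversion while handling the "wrong length" case explicitly is `try_into`, which is implemented from slices to arrays of Copy types. A minimal sketch; the helper name `slice_to_u32` is hypothetical:

```rust
use std::convert::TryInto;

// Hypothetical helper: returns None unless the slice is exactly 4 bytes long.
fn slice_to_u32(buf: &[u8]) -> Option<u32> {
    let arr: [u8; 4] = buf.try_into().ok()?;
    Some(u32::from_be_bytes(arr))
}

fn main() {
    let buf: &[u8] = &[0, 0, 0, 1];
    println!("{:?}", slice_to_u32(buf));
    println!("{:?}", slice_to_u32(&buf[..2]));
}
```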
See also:
How to get a slice as an array in Rust?
How to convert a slice into an array reference?
I'd like to give this answer here to add the following details:
A working code snippet which converts slice to integer (two ways to do it).
A working solution in no_std environment.
To keep everything in one place for the people getting here from the search engine.
Without external crates, the following methods are suitable to convert from slices to integer even for no_std build starting from Rust 1.32:
Method 1 (try_into + from_be_bytes)
use core::convert::TryInto;
let src = [1, 2, 3, 4, 5, 6, 7];
// 0x03040506
u32::from_be_bytes(src[2..6].try_into().unwrap());
use core::convert::TryInto is for a no_std build. With the standard library, use the following instead: use std::convert::TryInto;.
(Endianness has already been covered above, but to keep everything in one place: choose from_le_bytes, from_be_bytes, or from_ne_bytes depending on how the integer is represented in memory.)
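The effect of the three variants on the same four bytes can be compared side by side. A small sketch, with `decode_all` as a made-up helper name:

```rust
// Hypothetical helper: decodes the same bytes as big-, little- and native-endian.
fn decode_all(bytes: [u8; 4]) -> (u32, u32, u32) {
    (
        u32::from_be_bytes(bytes), // first byte is the most significant
        u32::from_le_bytes(bytes), // first byte is the least significant
        u32::from_ne_bytes(bytes), // whatever the target CPU uses
    )
}

fn main() {
    let (be, le, ne) = decode_all([0x03, 0x04, 0x05, 0x06]);
    println!("be = {:#010x}", be);
    println!("le = {:#010x}", le);
    println!("ne = {:#010x}", ne);
}
```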
Method 2 (clone_from_slice + from_be_bytes)
let src = [1, 2, 3, 4, 5, 6, 7];
let mut dst = [0u8; 4];
dst.clone_from_slice(&src[2..6]);
// 0x03040506
u32::from_be_bytes(dst);
Result
In both cases integer will be equal to 0x03040506.
This custom serialize_deserialize_u8_i32 library will safely convert back and forth between u8 array and i32 array i.e. the serialise function will take all of your u8 values and pack them into i32 values and the deserialise function will take this library’s custom i32 values and convert them back to the original u8 values that you started with.
This was built for a specific purpose, however it may come in handy for other uses; depending on whether you want/need a fast/custom converter like this.
https://github.com/second-state/serialize_deserialize_u8_i32
Here’s my implementation (for a different use case) that discards any additional bytes beyond 8 (and therefore doesn’t need to panic if the length is not exactly 8; it still panics if the slice is shorter than 8 bytes):
pub fn u64_from_slice(slice: &[u8]) -> u64 {
u64::from_ne_bytes(slice.split_at(8).0.try_into().unwrap())
}
The split_at() method returns a tuple of two slices: one from index 0 until the specified index and the other from the specified index until the end. So by using .0 to access the first member of the tuple returned by .split_at(8), it ensures that only the first 8 bytes are passed to u64::from_ne_bytes(), discarding the leftovers. Then, of course, it calls the try_into method on that .0 tuple member, and .unwrap() since split_at does all the custom panicking for you.
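If panicking on short input is not acceptable, a non-panicking variant is possible with `slice::get`. This is only a sketch, and the name `try_u64_from_slice` is an assumption for illustration.

```rust
use std::convert::TryInto;

// Hypothetical helper: like u64_from_slice, but returns None for short input.
fn try_u64_from_slice(slice: &[u8]) -> Option<u64> {
    let head = slice.get(..8)?; // None if fewer than 8 bytes are available
    let arr: [u8; 8] = head.try_into().ok()?; // always succeeds: head has 8 bytes
    Some(u64::from_ne_bytes(arr))
}

fn main() {
    let data = [1u8, 0, 0, 0, 0, 0, 0, 0, 9, 9];
    println!("{:?}", try_u64_from_slice(&data));
    println!("{:?}", try_u64_from_slice(&data[..3]));
}
```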