I'm trying to format an u64 to a &str, but without dynamically allocating any memory on heap. I want to manually declare a space on stack (e.g., let mut buffer = [0u8; 20] and print the u64 to buffer and get a &str from it with some unsafe.
I tryied write!(&mut buffer[..], "{}", i), but it returns a Result<()> and I couldn't get the length of the formatted string so as to unsafely convert it to &str.
I'm currently straightly coping the implementation of Display for u64 from std library, is there a better way of doing so?
You could use a Cursor:
use std::io::{Write, Cursor};
fn main() {
let mut cursor = Cursor::new([0u8; 20]);
let i = 42u64;
write!(cursor, "{i}").unwrap();
let pos = cursor.position();
let buffer = cursor.into_inner();
let text = std::str::from_utf8(&buffer[..pos as usize]).unwrap();
println!("{text}");
}
Related
i want to convert u32 into ASCII bytes.
input: 1u32
output [49]
This was my try, but its empty with 0u32 and also using Vec, i would prefer ArrayVec but how do i know the size of the number. Is there any simple way to do this , without using any dynamic allocations?
let mut num = 1u32;
let base = 10u32;
let mut a: Vec<char> = Vec::new();
while num != 0 {
let chars = char::from_digit(num % base,10u32).unwrap();
a.push(chars);
num /= base;
}
let mut vec_of_u8s: Vec<u8> = a.iter().map(|c| *c as u8).collect();
vec_of_u8s.reverse();
println!("{:?}",vec_of_u8s)
Use the write! macro and ArrayVec with the capacity set to 10 (the maximum digits of a u32):
use std::io::Write;
use arrayvec::ArrayVec; // 0.7.2
fn main() {
let input = 1u32;
let mut buffer = ArrayVec::<u8, 10>::new();
write!(buffer, "{}", input).unwrap();
dbg!(buffer);
}
[src/main.rs:10] buffer = [
49,
]
This question already has answers here:
How do I convert a Vec<T> to a Vec<U> without copying the vector?
(2 answers)
Closed 3 years ago.
Is there a better way to cast Vec<i8> to Vec<u8> in Rust except for these two?
creating a copy by mapping and casting every entry
using std::transmute
The (1) is slow, the (2) is "transmute should be the absolute last resort" according to the docs.
A bit of background maybe: I'm getting a Vec<i8> from the unsafe gl::GetShaderInfoLog() call and want to create a string from this vector of chars by using String::from_utf8().
The other answers provide excellent solutions for the underlying problem of creating a string from Vec<i8>. To answer the question as posed, creating a Vec<u8> from data in a Vec<i8> can be done without copying or transmuting the vector. As pointed out by #trentcl, transmuting the vector directly constitutes undefined behavior because Vec is allowed to have different layout for different types.
The correct (though still requiring the use of unsafe) way to transfer a vector's data without copying it is:
obtain the *mut i8 pointer to the data in the vector, along with its length and capacity
leak the original vector to prevent it from freeing the data
use Vec::from_raw_parts to build a new vector, giving it the pointer cast to *mut u8 - this is the unsafe part, because we are vouching that the pointer contains valid and initialized data, and that it is not in use by other objects, and so on.
This is not UB because the new Vec is given the pointer of the correct type from the start. Code (playground):
fn vec_i8_into_u8(v: Vec<i8>) -> Vec<u8> {
// ideally we'd use Vec::into_raw_parts, but it's unstable,
// so we have to do it manually:
// first, make sure v's destructor doesn't free the data
// it thinks it owns when it goes out of scope
let mut v = std::mem::ManuallyDrop::new(v);
// then, pick apart the existing Vec
let p = v.as_mut_ptr();
let len = v.len();
let cap = v.capacity();
// finally, adopt the data into a new Vec
unsafe { Vec::from_raw_parts(p as *mut u8, len, cap) }
}
fn main() {
let v = vec![-1i8, 2, 3];
assert!(vec_i8_into_u8(v) == vec![255u8, 2, 3]);
}
transmute on a Vec is always, 100% wrong, causing undefined behavior, because the layout of Vec is not specified. However, as the page you linked also mentions, you can use raw pointers and Vec::from_raw_parts to perform this correctly. user4815162342's answer shows how.
(std::mem::transmute is the only item in the Rust standard library whose documentation consists mostly of suggestions for how not to use it. Take that how you will.)
However, in this case, from_raw_parts is also unnecessary. The best way to deal with C strings in Rust is with the wrappers in std::ffi, CStr and CString. There may be better ways to work this in to your real code, but here's one way you could use CStr to borrow a Vec<c_char> as a &str:
const BUF_SIZE: usize = 1000;
let mut info_log: Vec<c_char> = vec![0; BUF_SIZE];
let mut len: usize;
unsafe {
gl::GetShaderInfoLog(shader, BUF_SIZE, &mut len, info_log.as_mut_ptr());
}
let log = Cstr::from_bytes_with_nul(info_log[..len + 1])
.expect("Slice must be nul terminated and contain no nul bytes")
.to_str()
.expect("Slice must be valid UTF-8 text");
Notice there is no unsafe code except to call the FFI function; you could also use with_capacity + set_len (as in wasmup's answer) to skip initializing the Vec to 1000 zeros, and use from_bytes_with_nul_unchecked to skip checking the validity of the returned string.
See this:
fn get_compilation_log(&self) -> String {
let mut len = 0;
unsafe { gl::GetShaderiv(self.id, gl::INFO_LOG_LENGTH, &mut len) };
assert!(len > 0);
let mut buf = Vec::with_capacity(len as usize);
let buf_ptr = buf.as_mut_ptr() as *mut gl::types::GLchar;
unsafe {
gl::GetShaderInfoLog(self.id, len, std::ptr::null_mut(), buf_ptr);
buf.set_len(len as usize);
};
match String::from_utf8(buf) {
Ok(log) => log,
Err(vec) => panic!("Could not convert compilation log from buffer: {}", vec),
}
}
See ffi:
let s = CStr::from_ptr(strz_ptr).to_str().unwrap();
Doc
I have something that is Read; currently it's a File. I want to read a number of bytes from it that is only known at runtime (length prefix in a binary data structure).
So I tried this:
let mut vec = Vec::with_capacity(length);
let count = file.read(vec.as_mut_slice()).unwrap();
but count is zero because vec.as_mut_slice().len() is zero as well.
[0u8;length] of course doesn't work because the size must be known at compile time.
I wanted to do
let mut vec = Vec::with_capacity(length);
let count = file.take(length).read_to_end(vec).unwrap();
but take's receiver parameter is a T and I only have &mut T (and I'm not really sure why it's needed anyway).
I guess I can replace File with BufReader and dance around with fill_buf and consume which sounds complicated enough but I still wonder: Have I overlooked something?
Like the Iterator adaptors, the IO adaptors take self by value to be as efficient as possible. Also like the Iterator adaptors, a mutable reference to a Read is also a Read.
To solve your problem, you just need Read::by_ref:
use std::io::Read;
use std::fs::File;
fn main() {
let mut file = File::open("/etc/hosts").unwrap();
let length = 5;
let mut vec = Vec::with_capacity(length);
file.by_ref().take(length as u64).read_to_end(&mut vec).unwrap();
let mut the_rest = Vec::new();
file.read_to_end(&mut the_rest).unwrap();
}
1. Fill-this-vector version
Your first solution is close to work. You identified the problem but did not try to solve it! The problem is that whatever the capacity of the vector, it is still empty (vec.len() == 0). Instead, you could actually fill it with empty elements, such as:
let mut vec = vec![0u8; length];
The following full code works:
#![feature(convert)] // needed for `as_mut_slice()` as of 2015-07-19
use std::fs::File;
use std::io::Read;
fn main() {
let mut file = File::open("/usr/share/dict/words").unwrap();
let length: usize = 100;
let mut vec = vec![0u8; length];
let count = file.read(vec.as_mut_slice()).unwrap();
println!("read {} bytes.", count);
println!("vec = {:?}", vec);
}
Of course, you still have to check whether count == length, and read more data into the buffer if that's not the case.
2. Iterator version
Your second solution is better because you won't have to check how many bytes have been read, and you won't have to re-read in case count != length. You need to use the bytes() function on the Read trait (implemented by File). This transform the file into a stream (i.e an iterator). Because errors can still happen, you don't get an Iterator<Item=u8> but an Iterator<Item=Result<u8, R::Err>>. Hence you need to deal with failures explicitly within the iterator. We're going to use unwrap() here for simplicity:
use std::fs::File;
use std::io::Read;
fn main() {
let file = File::open("/usr/share/dict/words").unwrap();
let length: usize = 100;
let vec: Vec<u8> = file
.bytes()
.take(length)
.map(|r: Result<u8, _>| r.unwrap()) // or deal explicitly with failure!
.collect();
println!("vec = {:?}", vec);
}
You can always use a bit of unsafe to create a vector of uninitialized memory. It is perfectly safe to do with primitive types:
let mut v: Vec<u8> = Vec::with_capacity(length);
unsafe { v.set_len(length); }
let count = file.read(vec.as_mut_slice()).unwrap();
This way, vec.len() will be set to its capacity, and all bytes in it will be uninitialized (likely zeros, but possibly some garbage). This way you can avoid zeroing the memory, which is pretty safe for primitive types.
Note that read() method on Read is not guaranteed to fill the whole slice. It is possible for it to return with number of bytes less than the slice length. There are several RFCs on adding methods to fill this gap, for example, this one.
This question already has answers here:
How to convert Vec<char> to a string
(2 answers)
Closed 6 years ago.
I've got a Vec<char> that I need to turn into a &str or String, but I'm unsure of the best way to do this. I've looked around and every resource I've found seems to be out-dated in some way. The answers in this question don't seem to be applicable for the newest build.
I'm using the nightly for 2015-3-19
The iterator based approach with .collect should work, after updating for language changes:
char_vector.iter().cloned().collect::<String>();
(I've chosen to replace .map(|c| *c) with .cloned() but either works.)
If your vector can be consumed, you can also use into_iter to avoid the clone
fn main() {
let char_vector = vec!['h', 'e', 'l', 'l', 'o'];
let str: String = char_vector.into_iter().collect();
println!("{}", str);
}
You can convert the Vec into a String without doing any allocations. It requires quite some unsafe code though:
#![feature(raw, unicode)]
use std::raw::Repr;
use std::slice::from_raw_parts_mut;
fn inplace_to_string(v: Vec<char>) -> String {
unsafe {
let mut i = 0;
{
let ch_v = &v[..];
let r = ch_v.repr();
let p: &mut [u8] = from_raw_parts_mut(r.data as *mut u8, r.len*4);
for ch in ch_v {
i += ch.encode_utf8(&mut p[i..i+4]).unwrap();
}
}
let p = v.as_ptr();
let cap = v.capacity()*4;
std::mem::forget(v);
let v = Vec::from_raw_parts(p as *mut u8, i, cap);
String::from_utf8_unchecked(v)
}
}
fn main() {
let char_vector = vec!['h', 'ä', 'l', 'l', 'ö'];
let str: String = char_vector.iter().cloned().collect();
let str2 = inplace_to_string(char_vector);
println!("{}", str);
println!("{}", str2);
}
PlayPen
Detailed Explanation
This creates a mutable u8 slice and a char slice simultaneously to the same buffer (breaking all Rust guarantees). Note that the u8 slice is four times as large as the char slice, since char always takes up 4 bytes.
let ch_v = &v[..];
let r = ch_v.repr();
let v: &mut [u8] = from_raw_parts_mut(r.data as *mut u8, r.len*4);
We need that to iterate over the unicode chars and replace them by their utf8 encoded counterpart. Since utf8 is always shorter or the same length as unicode, we can guarantee that we never overwrite any part we haven't read yet.
for ch in ch_v {
i += ch.encode_utf8(&mut v[i..i+4]).unwrap();
}
Since a char is always unicode and our buffer is always exactly 4 bytes (which is the maximum number of bytes a utf8 encoded unicode char will need), we can encode our chars to utf8 without checking if it worked (it will always work). The encode_utf8 function returns the length of the utf8 representation. Our index i is the location of the last written utf8 char.
Finally we need to do some cleaning up. Our vector is still of type Vec<char>. We get all the info we need (Pointer to the heap allocated array and the capacity)
let p = v.as_ptr();
let cap = v.capacity()*4;
Then we release the previous vector from all obligations like freeing memory.
std::mem::forget(v);
and finally recreate the u8 vector with correct length and capacity, and directly turn it into a String. The conversion to String does not need to be checked, as we already know the utf8 is correct, since the original Vec<char> could only contain correct unicode chars.
let v = Vec::from_raw_parts(p as *mut u8, i, cap);
String::from_utf8_unchecked(v)
To be able to use C library, I need to give a *mut c_char parameter to a function. But I don't find a way to have it from a str.
I converted my str to a CString, that's ok, but there's no more way from CString to get a *mut c_char in the nightly build. I found that in 0.12.0 there was a method, but now, what is the process to get that *mut c_char?
let bytes = String::from_str("Test").into_bytes() + b"\0";
let cchars = bytes.map_in_place(|b| b as c_char);
let name: *mut c_char = cchars.as_mut_ptr();
The basic idea is the same as yours but there is no need to slice the Vec explicitly; also a zero byte is appended to the buffer. See also my comment to the question.
Five years later and none of these solutions are working (for me, at least).
A slightly modified version of this answer:
let string: &str = "Hello, world!";
let bytes: Vec<u8> = String::from(string).into_bytes();
let mut c_chars: Vec<i8> = bytes.iter().map(| c | *c as i8).collect::<Vec<i8>>();
c_chars.push(0); // null terminator
let ptr: *mut c_char = c_chars.as_mut_ptr();
Look at this piece of documentation, the fn as_mut_ptr() has been moved to the slice part of the API. So you need a mutable slice of type &mut [c_char]. And AFAIK you cannot get that from a CString, those would be read-only.
Instead you can use a mut Vec<c_char>:
let mut x : Vec<c_char> = ...;
let slice = x.as_mut_slice();
let ptr = slice.as_mut_ptr();
Now (2022), into_raw() works well for me, for example:
use std::ffi::CString;
use std::os::raw::c_char;
fn main() {
let c_string = CString::new("Hello!").expect("CString::new failed");
let raw: *mut c_char = c_string.into_raw();
}
Check https://doc.rust-lang.org/std/ffi/struct.CString.html#method.into_raw for more information.
Thanks to rodrigo's post, I found a way (very dirty, but it works) to solve my problem. Here is the code :
let mut string = std::string::String::from_str("Test");
let bytes = string.into_bytes();
let mut cchar : Vec<c_char> = bytes.map_in_place(|w| w as c_char);
let slice = cchar.as_mut_slice();
let name: *mut c_char = slice.as_mut_ptr();
A bit complex in my opinion
If you want to
let mut hello = b"Hello World\0".to_vec().as_mut_ptr() as *mut i8;