I want to convert arrays.
Example:
fn func() -> *mut *mut f32;
...
let buffer = func();
for n in 0..48000 {
buffer[0][n] = 1.0;
buffer[1][n] = 3.0;
}
In Rust, &[T]/&mut [T] is called a slice. A slice is not an array; it is a pointer to the beginning of an array together with the number of items in that array. Therefore, to create a &mut [T] out of a *mut T, you need to know the length of the array behind the pointer.
*mut *mut T looks like a C implementation of a 2D, possibly jagged, array, i.e. an array of arrays (this is different from a contiguous 2D array, as you probably know). There is no free way to convert it to &mut [&mut [T]], because, as I said before, *mut T is one pointer-sized number, while &mut [T] is two pointer-sized numbers. So you can't, for example, transmute *mut T to &mut [T], it would be a size mismatch. Therefore, you can't simply transform *mut *mut f32 to &mut [&mut [f32]] because of the layout mismatch.
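The size mismatch is easy to check directly. This is a minimal sketch; the function name is made up for illustration, and the "exactly two pointers wide" result is what current compilers produce, not a formally documented layout guarantee.

```rust
use std::mem::size_of;

/// Hypothetical helper: returns (size of a raw pointer, size of a mutable
/// slice reference) in bytes, so the two can be compared.
pub fn pointer_and_slice_sizes() -> (usize, usize) {
    // *mut f32 is a thin pointer; &mut [f32] carries a pointer plus a length.
    (size_of::<*mut f32>(), size_of::<&mut [f32]>())
}
```

On a typical 64-bit target this gives 8 and 16 bytes, which is why transmute between the two is rejected.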
To work safely with the numbers stored behind a *mut *mut f32, you first need to determine the length of the outer array and the lengths of all the inner arrays. For simplicity, let's assume they are all known statically:
const ROWS: usize = 2; // number of inner arrays; the question only uses buffer[0] and buffer[1]
const COLUMNS: usize = 48000;
Now, since you know the length, you can convert the outer pointer to a slice of raw pointers:
use std::slice;
let buffer: *mut *mut f32 = func();
let buf_slice: &mut [*mut f32] = unsafe {
    // no trailing semicolon: the block must evaluate to the slice
    slice::from_raw_parts_mut(buffer, ROWS)
};
Now you need to go through this slice and convert each item to a slice, collecting the results into a vector:
let mut matrix: Vec<&mut [f32]> = buf_slice.iter_mut()
    // `p` is a `&mut *mut f32`, so dereference it to get the row pointer
    .map(|p| unsafe { slice::from_raw_parts_mut(*p, COLUMNS) })
    .collect();
And now you can indeed access your buffer by indices:
for n in 0..COLUMNS {
matrix[0][n] = 1.0;
matrix[1][n] = 3.0;
}
(I have put explicit types on the bindings for readability; most of them can in fact be omitted.)
So, there are two main things to consider when converting raw pointers to slices:
you need to know the exact length of the array to create a slice from it; if you know it, you can use slice::from_raw_parts() or slice::from_raw_parts_mut();
if you are converting nested arrays, you need to rebuild each layer of indirection, because pointers have a different size than slices.
And naturally, you have to track who is the owner of the buffer and when it will be freed, otherwise you can easily get a slice pointing to a buffer which does not exist anymore. This is unsafe, after all.
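Putting the steps together, here is a minimal sketch of a reusable helper. The name rows_as_slices and its signature are my own invention, and it assumes every row has the same length; all the real safety obligations are pushed onto the caller.

```rust
use std::slice;

/// Hypothetical helper: views a C-style array of `rows` row pointers, each
/// pointing at `cols` floats, as a vector of mutable slices.
///
/// # Safety
/// `ptr` must point to `rows` valid row pointers, the rows must not overlap,
/// each row must hold `cols` initialized `f32`s, and all of that memory must
/// stay alive and otherwise untouched for the lifetime `'a`.
pub unsafe fn rows_as_slices<'a>(
    ptr: *mut *mut f32,
    rows: usize,
    cols: usize,
) -> Vec<&'a mut [f32]> {
    // First layer: the outer array of raw row pointers.
    let outer: &[*mut f32] = unsafe { slice::from_raw_parts(ptr, rows) };
    // Second layer: rebuild each row as a real slice (pointer + length).
    outer
        .iter()
        .map(|&row| unsafe { slice::from_raw_parts_mut(row, cols) })
        .collect()
}
```

The returned Vec only stores the fat pointers; the floats themselves are not copied, so writes through the slices land in the original buffer.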
Since your array seems to be an array of pointers to an array of 48000 f32s, you can simply use fixed size arrays ([T; N]) instead of slices ([T]):
fn func() -> *mut *mut f32 { unimplemented!() }
fn main() {
let buffer = func();
let buffer: &mut [&mut [f32; 48000]; 2] = unsafe { std::mem::transmute(buffer) };
for n in 0..48000 {
buffer[0][n] = 1.0;
buffer[1][n] = 3.0;
}
}
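The same reinterpretation can also be sketched with a raw-pointer cast instead of transmute, which at least keeps the source and target types visible. The helper name and the const parameter N are assumptions for illustration; it relies on a reference to a fixed-size array being a thin pointer, just like *mut f32.

```rust
/// Hypothetical helper: treats `ptr` as a pointer to exactly two row
/// pointers, each pointing at `N` floats.
///
/// # Safety
/// `ptr` must point to two valid, non-null, properly aligned row pointers,
/// the rows must not overlap, each must hold `N` initialized `f32`s, and the
/// memory must stay alive and otherwise untouched for `'a`.
pub unsafe fn as_two_channels<'a, const N: usize>(
    ptr: *mut *mut f32,
) -> &'a mut [&'a mut [f32; N]; 2] {
    // Both `*mut f32` and `&mut [f32; N]` are one pointer wide, so the outer
    // array of two pointers can be viewed as an array of two references.
    unsafe { &mut *(ptr as *mut [&'a mut [f32; N]; 2]) }
}
```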
How can I copy a row of pixels from an i32 slice into an existing [u8] slice of pixels?
Both slices are in the same memory layout (i.e. RGBA) but I don't know the unsafe syntax to copy one efficiently into the other. In C it would just be a memcpy().
You can flat_map the byte representation of each i32 into a Vec<u8>:
fn main() {
let pixels: &[i32] = &[-16776961, 16711935, 65535, -1];
let bytes: Vec<u8> = pixels
.iter()
.flat_map(|e| e.to_ne_bytes())
.collect();
println!("{bytes:?}");
}
There are different ways to handle the endianness of the system. I used to_ne_bytes to preserve the native order, but there are also to_le_bytes and to_be_bytes if the byte order needs to be controlled.
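Since the question asks about copying into an existing [u8] slice, here is a sketch that writes in place instead of collecting a new Vec. The function name is hypothetical, and I picked to_le_bytes so the result does not depend on the host; swap in to_ne_bytes if you want memcpy-like behavior.

```rust
/// Hypothetical helper: copies each i32 pixel into `dst` as 4 little-endian
/// bytes. Panics if `dst` is not exactly 4 bytes per pixel.
pub fn copy_pixels_le(src: &[i32], dst: &mut [u8]) {
    assert_eq!(dst.len(), src.len() * 4, "destination must hold 4 bytes per pixel");
    // Walk the destination in 4-byte chunks, one chunk per source pixel.
    for (chunk, pixel) in dst.chunks_exact_mut(4).zip(src) {
        chunk.copy_from_slice(&pixel.to_le_bytes());
    }
}
```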
Alternatively, if you know the size of your pixel buffer ahead of time, you can use an unsafe transmute:
const BUF_LEN: usize = 4; // this is your buffer length
fn main() {
let pixels: [i32; BUF_LEN] = [-16776961, 16711935, 65535, -1];
let bytes = unsafe {
std::mem::transmute::<[i32; BUF_LEN], [u8; BUF_LEN * 4]>(pixels)
};
println!("{bytes:?}");
}
Assuming that you in fact do not need any byte reordering, the bytemuck library is the tool to use here, as it allows you to write the i32 to u8 reinterpretation without needing to consider safety (because bytemuck has checked it for you).
Specifically, bytemuck::cast_slice() will allow converting &[i32] to &[u8].
(In general, the function may panic if there is an alignment or size problem, but there never can be such a problem when converting to u8 or any other one-byte type.)
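If you would rather not add a dependency, the i32 to u8 case can be sketched with the standard library alone. This is a hand-written stand-in with a made-up name, not bytemuck's implementation; it is only sound because u8 has alignment 1 and every byte of an i32 is initialized.

```rust
use std::{mem, slice};

/// Hypothetical std-only stand-in for reinterpreting `&[i32]` as `&[u8]`
/// in native byte order, without copying.
pub fn pixels_as_bytes(pixels: &[i32]) -> &[u8] {
    // SAFETY: the returned slice covers exactly the memory of `pixels`,
    // borrows it for the same lifetime, and u8 has no alignment or validity
    // requirements that an i32's bytes could violate.
    unsafe { slice::from_raw_parts(pixels.as_ptr() as *const u8, mem::size_of_val(pixels)) }
}
```

From there, dst.copy_from_slice(pixels_as_bytes(src)) is the memcpy equivalent, provided both slices have the same byte length.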
I have a vector holding n string slices. I would like to construct a string slice based on these.
fn main() {
let v: Vec<&str> = vec!["foo", "bar"];
let h: &str = "home";
let result = format!("hello={}#{}&{}#{}", v[0], h, v[1], h);
println!("{}", result);
}
I searched through the docs but I failed to find anything on this subject.
This can be done (somewhat inefficiently) with iterators:
let result = format!("hello={}",
v.iter().map(|s| format!("{}#{}", s, h))
.collect::<Vec<_>>()
.join("&")
);
If high performance is needed, a loop that builds a String will be quite a bit faster. The approach above allocates an additional String for each input &str, then a vector to hold them all before finally joining them together.
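A minimal sketch of such a loop, with the "hello=" prefix and the '#' and '&' separators taken from the question; the function name is made up.

```rust
/// Hypothetical helper: builds "hello=a#h&b#h..." in a single String,
/// without any intermediate String or Vec allocations.
pub fn build_query(values: &[&str], suffix: &str) -> String {
    let mut out = String::from("hello=");
    for (i, value) in values.iter().enumerate() {
        // Put the delimiter between items, not before the first one.
        if i > 0 {
            out.push('&');
        }
        out.push_str(value);
        out.push('#');
        out.push_str(suffix);
    }
    out
}
```

The String may still reallocate as it grows; String::with_capacity would avoid that if the total length is known up front.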
Here's a more efficient way to implement this. The function calls the passed closure for each element of the iterator, giving it access to the std::fmt::Write reference passed in, and writes the delimiter in between successive calls. (Note that String implements std::fmt::Write!)
use std::fmt::Write;
fn delimited_write<W, I, V, F>(writer: &mut W, seq: I, delim: &str, mut func: F)
-> Result<(), std::fmt::Error>
where W: Write,
I: IntoIterator<Item=V>,
F: FnMut(&mut W, V) -> Result<(), std::fmt::Error>
{
let mut iter = seq.into_iter();
match iter.next() {
None => { },
Some(v) => {
func(writer, v)?;
for v in iter {
writer.write_str(delim)?;
func(writer, v)?;
}
},
};
Ok(())
}
You'd use it to implement your operation like so:
use std::fmt::Write;
fn main() {
let v: Vec<&str> = vec!["foo", "bar"];
let h: &str = "home";
let mut result: String = "hello=".to_string();
delimited_write(&mut result, v.iter(), "&", |w, i| {
write!(w, "{}#{}", i, h)
}).expect("write succeeded");
println!("{}", result);
}
It's not as pretty, but it makes no temporary String or Vec allocations.
You will need to iterate over the vector as cdhowie suggests above. Let me explain why this is necessarily an O(n) problem and you can't create a single string slice from a vector of string slices without iterating over the vector:
Your vector only holds references to the strings; it doesn't hold the strings themselves. The strings are likely not stored contiguously in memory (only their references inside your vector are) so combining them into a single slice is not as simple as creating a slice that points to the beginning of the first string referenced in the vector and then extending the size of the slice.
Given that a &str is just a pointer to a location in memory or in the application binary where a str (essentially a sequence of UTF-8 bytes) is stored, plus an integer giving the length of the slice, you can imagine that if the first &str in your vector references a string on the stack and the next one references a hardcoded string stored in the executable binary of the program, there is no way to create a single &str that points to both strs without copying at least one of them (in practice, both will probably be copied).
In order to get a single string slice from all of those &str's in your vector, you need to copy each of the str's they reference to a single, contiguous chunk of memory and then create a slice of that chunk. That copying requires iterating over the vector.
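As a sketch of that copying step (the function name is mine): every piece is copied once into one owned String, and the single &str you wanted is then a borrow of that String. The standard library's concat() on a slice of &str does the same job.

```rust
/// Hypothetical helper: copies every piece into one contiguous String.
pub fn concat_parts(parts: &[&str]) -> String {
    // One pass to size the buffer, one pass to copy: O(n) either way.
    let total: usize = parts.iter().map(|s| s.len()).sum();
    let mut out = String::with_capacity(total);
    for part in parts {
        out.push_str(part);
    }
    out
}
```

Usage would look like: let owned = concat_parts(&v); let joined: &str = &owned;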
It is a common pattern to see this 'shortcut' code in Rust:
unsafe fn any_as_u8_slice<T: Sized>(p: &T) -> &[u8] {
::std::slice::from_raw_parts(
(p as *const T) as *const u8,
::std::mem::size_of::<T>(),
)
}
i.e., given a struct, unsafely convert the underlying pointer to a &[u8] to read the bytes.
However, is it valid to take the same approach when using Vec<T>?
For example, this appears to work:
use std::mem::size_of;
use std::slice::from_raw_parts;
#[repr(C)]
#[derive(Debug, Copy, Clone)]
pub struct Point {
pub x: u8,
pub y: u8,
pub z: u8,
}
fn as_bytes(data: &[Point]) -> &[u8] {
unsafe {
let raw_pointer = data.as_ptr();
from_raw_parts(raw_pointer as *const u8, size_of::<Point>() * data.len())
}
}
fn main() {
let points = vec![Point{x: 0u8, y: 1u8, z: 2u8}, Point{x: 3u8, y: 4u8, z: 5u8}];
let slice = points.as_slice();
println!("{:?}", slice);
let bytes = as_bytes(slice);
println!("{:?}", bytes);
assert!(bytes.len() == 6);
assert!(bytes[0] == 0u8);
assert!(bytes[1] == 1u8);
assert!(bytes[2] == 2u8);
assert!(bytes[3] == 3u8);
assert!(bytes[4] == 4u8);
assert!(bytes[5] == 5u8);
}
...but is it reliable to assume that Vec<T> is represented as a single contiguous block of data this way?
The documentation on https://doc.rust-lang.org/std/vec/struct.Vec.html#capacity-and-reallocation says:
If a Vec has allocated memory, then the memory it points to is on the heap (as defined by the allocator Rust is configured to use by default), and its pointer points to len initialized, contiguous elements in order (what you would see if you coerced it to a slice), followed by capacity-len logically uninitialized, contiguous elements.
...but I'm not really sure if I understand what it means. Does this actually mean that for Vec<T> the underlying pointer is to a block of memory of length size_of::<T> * length of the Vec?
Yes, a Vec<T> can be made into something that can be treated as a pointer to a block of memory of length std::mem::size_of::<T>() times the length of Vec.
There is one caveat, as what you are actually interested in is the slice of T, which the Vec can provide; the Vec itself should be considered an implementation detail. Besides that:
A Vec<T> can deref to a slice [T]. Take that slice.
The Rust Reference states that a slice has the same layout as the section of the array it slices. So when we deref from a Vec<T> to a [T], this slice of length n is guaranteed to have the same memory layout as an array [T; n].
The Rust Reference defines the memory layout of an array:
Arrays are laid out so that the nth element of the array is offset
from the start of the array by n * the size of the type bytes. An
array of [T; n] has a size of size_of::<T>() * n and the same
alignment of T.
We know n (from [T]) and we know "the size of the type bytes" (via mem::size_of::<T>()). Since all members of an array must be fully initialized at all times, and given the two sentences from the paragraph above, we know it is safe to access all bytes up to mem::size_of::<T>() * the length of the Vec (actually the length of the slice, which is what brings in the array memory layout rule).
To make use of all that, you should make sure that you get a slice of the Vec first, use as_ptr() on the slice, and cast the raw pointer you get. This ensures the sequence of definitions as above. Your fn as_bytes(data: &[Point]) -> &[u8] is exactly correct.
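A slightly tighter sketch of the same idea, using std::mem::size_of_val so the byte length comes straight from the slice. The Rgb type and function name are placeholders. One caveat worth stating: this is only sound for types without padding, because padding bytes are uninitialized and must not be read as u8.

```rust
use std::{mem, slice};

/// Hypothetical padding-free type: three u8 fields under repr(C).
#[repr(C)]
#[derive(Debug, Copy, Clone)]
pub struct Rgb {
    pub r: u8,
    pub g: u8,
    pub b: u8,
}

/// Views a slice of `Rgb` as its underlying bytes without copying.
pub fn rgb_as_bytes(data: &[Rgb]) -> &[u8] {
    // SAFETY: `Rgb` has no padding, so every byte is initialized, and
    // size_of_val(data) is exactly size_of::<Rgb>() * data.len().
    unsafe { slice::from_raw_parts(data.as_ptr() as *const u8, mem::size_of_val(data)) }
}
```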
Is there a better way to cast Vec<i8> to Vec<u8> in Rust than these two?
1. creating a copy by mapping and casting every entry
2. using std::mem::transmute
Option (1) is slow, and for (2) the docs say "transmute should be the absolute last resort".
A bit of background maybe: I'm getting a Vec<i8> from the unsafe gl::GetShaderInfoLog() call and want to create a string from this vector of chars by using String::from_utf8().
The other answers provide excellent solutions for the underlying problem of creating a string from Vec<i8>. To answer the question as posed, creating a Vec<u8> from data in a Vec<i8> can be done without copying or transmuting the vector. As pointed out by @trentcl, transmuting the vector directly constitutes undefined behavior because Vec is allowed to have different layout for different types.
The correct (though still requiring the use of unsafe) way to transfer a vector's data without copying it is:
obtain the *mut i8 pointer to the data in the vector, along with its length and capacity
leak the original vector to prevent it from freeing the data
use Vec::from_raw_parts to build a new vector, giving it the pointer cast to *mut u8 - this is the unsafe part, because we are vouching that the pointer contains valid and initialized data, and that it is not in use by other objects, and so on.
This is not UB because the new Vec is given the pointer of the correct type from the start. Code:
fn vec_i8_into_u8(v: Vec<i8>) -> Vec<u8> {
// ideally we'd use Vec::into_raw_parts, but it's unstable,
// so we have to do it manually:
// first, make sure v's destructor doesn't free the data
// it thinks it owns when it goes out of scope
let mut v = std::mem::ManuallyDrop::new(v);
// then, pick apart the existing Vec
let p = v.as_mut_ptr();
let len = v.len();
let cap = v.capacity();
// finally, adopt the data into a new Vec
unsafe { Vec::from_raw_parts(p as *mut u8, len, cap) }
}
fn main() {
let v = vec![-1i8, 2, 3];
assert!(vec_i8_into_u8(v) == vec![255u8, 2, 3]);
}
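For comparison, here is the fully safe version of option (1) from the question, as a sketch with a made-up name. The `as` cast between i8 and u8 keeps the bit pattern, so the result matches the unsafe version; whether the compiler and standard library manage to reuse the allocation is an optimization I would not rely on, so measure before assuming it is slow.

```rust
/// Hypothetical safe counterpart: casts every element individually.
pub fn vec_i8_to_u8_by_copy(v: Vec<i8>) -> Vec<u8> {
    // `as u8` reinterprets the bits, e.g. -1i8 becomes 255u8.
    v.into_iter().map(|b| b as u8).collect()
}
```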
transmute on a Vec is always, 100% wrong, causing undefined behavior, because the layout of Vec is not specified. However, as the page you linked also mentions, you can use raw pointers and Vec::from_raw_parts to perform this correctly. user4815162342's answer shows how.
(std::mem::transmute is the only item in the Rust standard library whose documentation consists mostly of suggestions for how not to use it. Take that how you will.)
However, in this case, from_raw_parts is also unnecessary. The best way to deal with C strings in Rust is with the wrappers in std::ffi, CStr and CString. There may be better ways to work this into your real code, but here's one way you could use CStr to borrow the log buffer as a &str (the buffer is a Vec<u8> here, cast to *mut GLchar for the call, because CStr::from_bytes_with_nul takes a &[u8]):
use std::ffi::CStr;
use gl::types::{GLchar, GLsizei};
const BUF_SIZE: usize = 1000;
let mut info_log: Vec<u8> = vec![0; BUF_SIZE];
let mut len: GLsizei = 0;
unsafe {
    gl::GetShaderInfoLog(shader, BUF_SIZE as GLsizei, &mut len, info_log.as_mut_ptr() as *mut GLchar);
}
// `len` excludes the nul terminator, so take one extra byte
let log = CStr::from_bytes_with_nul(&info_log[..len as usize + 1])
    .expect("Slice must be nul terminated and contain no nul bytes")
    .to_str()
    .expect("Slice must be valid UTF-8 text");
Notice there is no unsafe code except to call the FFI function; you could also use with_capacity + set_len (as in wasmup's answer) to skip initializing the Vec to 1000 zeros, and use from_bytes_with_nul_unchecked to skip checking the validity of the returned string.
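The CStr part can be tried without OpenGL at all. Below is a sketch with a hypothetical helper that takes a byte buffer and the reported length (excluding the nul terminator) and returns errors instead of panicking.

```rust
use std::ffi::CStr;

/// Hypothetical helper: borrows the first `len` bytes of `buf`, plus the nul
/// terminator that must follow them, as a `&str`.
pub fn log_from_buffer(buf: &[u8], len: usize) -> Result<&str, String> {
    // Include the terminator: bytes 0..=len.
    let bytes = buf.get(..=len).ok_or("length exceeds buffer")?;
    // Fails if the last byte is not nul or if there is an interior nul.
    let c_str = CStr::from_bytes_with_nul(bytes).map_err(|e| e.to_string())?;
    // Fails if the text is not valid UTF-8.
    c_str.to_str().map_err(|e| e.to_string())
}
```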
See this:
fn get_compilation_log(&self) -> String {
let mut len = 0;
unsafe { gl::GetShaderiv(self.id, gl::INFO_LOG_LENGTH, &mut len) };
assert!(len > 0);
let mut buf = Vec::with_capacity(len as usize);
let buf_ptr = buf.as_mut_ptr() as *mut gl::types::GLchar;
unsafe {
gl::GetShaderInfoLog(self.id, len, std::ptr::null_mut(), buf_ptr);
buf.set_len(len as usize);
};
match String::from_utf8(buf) {
Ok(log) => log,
Err(vec) => panic!("Could not convert compilation log from buffer: {}", vec),
}
}
See std::ffi:
// `strz_ptr` must point to a valid nul-terminated C string
let s = unsafe { CStr::from_ptr(strz_ptr) }.to_str().unwrap();
Is there a good way to convert a Vec<T> with size S to an array of type [T; S]? Specifically, I'm using a function that returns a 128-bit hash as a Vec<u8>, which will always have length 16, and I would like to deal with the hash as a [u8; 16].
Is there something built-in akin to the as_slice method which gives me what I want, or should I write my own function which allocates a fixed-size array, iterates through the vector copying each element, and returns the array?
Arrays must be completely initialized, so you quickly run into concerns about what to do when you convert a vector with too many or too few elements into an array. These examples simply panic.
As of Rust 1.51 you can parameterize over an array's length.
use std::convert::TryInto;
fn demo<T, const N: usize>(v: Vec<T>) -> [T; N] {
v.try_into()
.unwrap_or_else(|v: Vec<T>| panic!("Expected a Vec of length {} but it was {}", N, v.len()))
}
As of Rust 1.48, each size needs to be a specialized implementation:
use std::convert::TryInto;
fn demo<T>(v: Vec<T>) -> [T; 4] {
v.try_into()
.unwrap_or_else(|v: Vec<T>| panic!("Expected a Vec of length {} but it was {}", 4, v.len()))
}
As of Rust 1.43:
use std::convert::TryInto;
fn demo<T>(v: Vec<T>) -> [T; 4] {
let boxed_slice = v.into_boxed_slice();
let boxed_array: Box<[T; 4]> = match boxed_slice.try_into() {
Ok(ba) => ba,
Err(o) => panic!("Expected a Vec of length {} but it was {}", 4, o.len()),
};
*boxed_array
}
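If panicking is not what you want, the same try_into conversion can hand the error back to the caller. A sketch for the 16-byte hash from the question, with a made-up function name; on a length mismatch the original Vec is returned unchanged in the Err variant.

```rust
use std::convert::TryInto;

/// Hypothetical helper: converts a 16-element Vec<u8> into [u8; 16],
/// returning the original vector if the length does not match.
pub fn hash_to_array(hash: Vec<u8>) -> Result<[u8; 16], Vec<u8>> {
    hash.try_into()
}
```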
See also:
How to get a slice as an array in Rust?
How do I get an owned value out of a `Box`?
Is it possible to control the size of an array using the type parameter of a generic?