Global static Vec with split exterior and interior mutability - rust

I have a global static Vec that represents a state. There seems to be no alternative to global state here: I am developing a library that a threaded program can use to make network connections, and I don't want the program to manage any of the internal data (such as the currently open sockets).
Example of what I have in mind (does not compile):
lazy_static! {
static ref SOCKETS: Vec<Connection> = Vec::new();
}
#[no_mangle]
pub extern fn ffi_connect(address: *const u8, length: usize) {
let address_str = unsafe { from_raw_parts(address, length) };
let conn = internal_connect(address_str);
// Now I need to lock all the Vec to mutate it
SOCKETS.push(conn);
}
#[no_mangle]
pub extern fn ffi_recv(index: usize, msg: *mut c_void, size: usize) -> usize {
let buf = unsafe { from_raw_parts(msg as *const u8, size) };
// Now I need to lock ONLY the specific "index" item to mutate it
let conn = SOCKETS.get_mut(index);
conn.read(buf)
}
#[no_mangle]
pub extern fn ffi_send(index: usize, msg: *mut c_void, size: usize) -> usize {
let buf = unsafe { from_raw_parts(msg as *const u8, size) };
// Now I need to lock ONLY the specific "index" item to mutate it
let conn = SOCKETS.get_mut(index);
conn.write(buf)
}
The question is how should I implement SOCKETS in order to be able to call ffi_recv and ffi_send from two threads?
I'm thinking that I have to have a RwLock outside the Vec, in order to be able to lock during ffi_connect (I don't care about blocking at that point) but get multiple immutable references during ffi_recv and ffi_send. Then, somehow I need to get the interior mutability of the object that the Vec is pointing to.
I DON'T want to be able to ffi_recv and ffi_send at the same time on the same object (this MUST throw an error)

I almost had the answer inside my question...
I just had to use RwLock<Vec<RwLock<Connection>>>. To mutate the Vec itself, take the outer write lock. To mutate an item of the Vec, take the outer read lock, which RwLock allows several threads to hold at once, and then read- or write-lock the inner RwLock of that item.
ffi_connect becomes:
#[no_mangle]
pub extern fn ffi_connect(address: *const u8, length: usize) {
let address_str = unsafe { from_raw_parts(address, length) };
let conn = internal_connect(address_str);
let mut socket_lock = SOCKETS.write().unwrap();
// Nobody can read or write SOCKETS right now
// Each element carries its own lock, so wrap the connection before storing it
socket_lock.push(RwLock::new(conn));
}
And ffi_recv becomes:
#[no_mangle]
pub extern fn ffi_recv(index: usize, msg: *mut c_void, size: usize) -> usize {
// Receiving fills the caller's buffer, so build a mutable slice
let buf = unsafe { from_raw_parts_mut(msg as *mut u8, size) };
let socket_lock = SOCKETS.read().unwrap();
// SOCKETS can only be "read" locked right now
// Lock ONLY the item at `index`; indexing panics if `index` is out of bounds.
// write() blocks while another thread holds this item; try_write() returns an error instead
let mut conn = socket_lock[index].write().unwrap();
// Nobody else can read or write to this exact object
// SOCKETS remains readable though!
conn.read(buf)
}
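The question also required that concurrent use of the same connection fail rather than wait. A minimal sketch of that variant follows, assuming a stand-in `Connection` type and hypothetical helper names (`connect`, `recv`, `RecvError`) in place of the FFI entry points. It uses `try_write` on the inner lock so a busy connection yields an error, and a plain `static` instead of lazy_static, which works because `RwLock::new` and `Vec::new` are const functions on recent compilers.

```rust
use std::sync::{RwLock, TryLockError};

// Hypothetical stand-in for the library's real connection type.
struct Connection {
    received: Vec<u8>,
}

impl Connection {
    // Copies pending bytes into `buf` and returns how many were copied.
    fn read(&mut self, buf: &mut [u8]) -> usize {
        let n = buf.len().min(self.received.len());
        buf[..n].copy_from_slice(&self.received[..n]);
        self.received.drain(..n);
        n
    }
}

// Outer lock guards the Vec itself; each inner lock guards one connection.
static SOCKETS: RwLock<Vec<RwLock<Connection>>> = RwLock::new(Vec::new());

#[derive(Debug, PartialEq)]
enum RecvError {
    BadIndex,
    Busy,
}

// Takes the outer write lock: nobody else can touch SOCKETS meanwhile.
fn connect(initial: &[u8]) -> usize {
    let mut sockets = SOCKETS.write().unwrap();
    sockets.push(RwLock::new(Connection { received: initial.to_vec() }));
    sockets.len() - 1
}

// Takes the outer read lock, then tries to lock only the requested item.
fn recv(index: usize, buf: &mut [u8]) -> Result<usize, RecvError> {
    let sockets = SOCKETS.read().unwrap();
    let slot = sockets.get(index).ok_or(RecvError::BadIndex)?;
    let result = match slot.try_write() {
        Ok(mut conn) => Ok(conn.read(buf)),
        // Another thread is using this connection: report it, don't wait.
        Err(TryLockError::WouldBlock) => Err(RecvError::Busy),
        Err(TryLockError::Poisoned(_)) => panic!("connection lock poisoned"),
    };
    result
}
```

Since both receiving and sending need exclusive access to the connection, a `Mutex` with `try_lock` would probably serve equally well for the inner lock.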

Related

How to make a field lifetime same as the struct?

I am getting memory allocated by an external C function. I then convert the memory to a [u8] slice using std::slice::from_raw_parts(). To avoid repeated calls to from_raw_parts() I want to store the slice as a field. So far I have this.
struct ImageBuffer<'a> {
//Slice view of buffer allocated by C
pub bytes: &'a [u8],
}
But this is already wrong. The declaration says bytes may outlive the ImageBuffer instance, which does not reflect reality: the buffer is freed in drop(), so bytes should have the same lifetime as the struct instance. How do I model that?
With the current code it is easy to use after free.
impl <'a> ImageBuffer<'a> {
pub fn new() -> ImageBuffer<'a> {
let size: libc::size_t = 100;
let ptr = unsafe {libc::malloc(size)};
let bytes = unsafe {std::slice::from_raw_parts(ptr as *const u8, size)};
ImageBuffer {
bytes,
}
}
}
impl <'a> Drop for ImageBuffer<'a> {
fn drop(&mut self) {
unsafe {
libc::free(self.bytes.as_ptr() as *mut libc::c_void)
};
}
}
fn main() {
let bytes;
{
let img = ImageBuffer::new();
bytes = img.bytes;
}
//Use after free!
println!("Size: {}. First: {}", bytes.len(), bytes[0]);
}
I can solve this problem by writing a getter function for bytes.
First make bytes private.
struct ImageBuffer<'a> {
bytes: &'a [u8],
}
Then write this getter method. This establishes the fact that the returned bytes has a lifetime same as the struct instance.
impl <'a> ImageBuffer<'a> {
pub fn get_bytes(&'a self) -> &'a [u8] {
self.bytes
}
}
Now, a use after free will not be allowed. The following will not compile.
fn main() {
let bytes;
{
let img = ImageBuffer::new();
bytes = img.get_bytes();
}
println!("Size: {}. First: {}", bytes.len(), bytes[0]);
}
I find this solution deeply disturbing. The struct declaration still conveys the wrong meaning (it still says bytes has a longer lifetime than the struct instance). The get_bytes() method counters that and conveys the correct meaning. I'm looking for an explanation of this situation and the best way to handle it.
Lifetimes cannot be used to express what you are doing, and therefore you should not be using a reference, since references always use lifetimes.
Instead, store a raw pointer in your struct. It can be a slice pointer; just not a slice reference.
struct ImageBuffer {
bytes: *const [u8],
}
To create the pointer, convert it from the C pointer without involving references. To create a safe reference, do it in your getter (or a Deref implementation):
impl ImageBuffer {
pub fn new() -> ImageBuffer {
let size: libc::size_t = 100;
let ptr = unsafe {libc::malloc(size)};
assert!(!ptr.is_null());
// slice_from_raw_parts is a safe function, so no unsafe block is needed
let bytes = std::ptr::slice_from_raw_parts(ptr as *const u8, size);
ImageBuffer {
bytes,
}
}
pub fn get_bytes(&self) -> &[u8] {
// Safety: the pointer is valid until `*self` is dropped, and it
// cannot be dropped while it is borrowed by this reference
unsafe { &*self.bytes }
}
}
This is essentially what Box, the basic owning pointer type, does, except that your pointer was allocated by the C allocator and therefore must be freed by that allocator too (in your Drop implementation).
All of this is the normal and routine thing to do to make a safe wrapper for an owning C pointer.
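Putting the pieces together, a complete wrapper might look like the sketch below. To keep it buildable without the libc crate, it assumes `std::alloc` as a stand-in for the C allocator (`alloc_zeroed` in place of `malloc`, `dealloc` in place of `free`); the `size` parameter on `new` is also an addition for illustration. The shape is the same as in the answer: a raw slice pointer as the field, a getter that ties the returned reference to `&self`, and a Drop implementation that releases the buffer with the allocator that produced it.

```rust
use std::alloc::{alloc_zeroed, dealloc, Layout};

pub struct ImageBuffer {
    // Raw slice pointer: carries the length, but no lifetime.
    bytes: *mut [u8],
}

impl ImageBuffer {
    pub fn new(size: usize) -> ImageBuffer {
        assert!(size > 0, "this sketch does not handle empty buffers");
        let layout = Layout::array::<u8>(size).expect("size overflow");
        // Stand-in for the C allocation. Zero-filled, so every byte is
        // initialized before anyone reads it.
        let ptr = unsafe { alloc_zeroed(layout) };
        assert!(!ptr.is_null());
        ImageBuffer {
            bytes: std::ptr::slice_from_raw_parts_mut(ptr, size),
        }
    }

    pub fn get_bytes(&self) -> &[u8] {
        // Safety: the pointer stays valid until `*self` is dropped, and
        // `*self` cannot be dropped while this borrow is alive.
        unsafe { &*self.bytes }
    }
}

impl Drop for ImageBuffer {
    fn drop(&mut self) {
        let len = self.get_bytes().len();
        let layout = Layout::array::<u8>(len).expect("size overflow");
        // Release with the same allocator that produced the buffer.
        unsafe { dealloc(self.bytes as *mut u8, layout) };
    }
}
```

With this shape, the use-after-free from the question no longer compiles, because the only way to reach the bytes is through a borrow of the `ImageBuffer` itself.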

How to return Rust Vec of unknown size over FFI?

I have a Rust library that returns Vecs of size that cannot be predicted by the caller, and I'm looking for an elegant way to implement the FFI interface.
Since Vecs can't be passed over FFI, I understand I need to first return the length, and then fill a buffer with the contents.
Here's my current solution, which works by manually managing the memory of the Vec and trusting the caller to:
1. Compute the FfiVec
2. Create a buffer of appropriate size to copy into
3. Copy from the FfiVec, cleaning it up in the process
#[repr(C)]
pub struct FfiVec<T: Sized> {
ptr: *mut Vec<T>,
len: c_uint, // necessary for C process to know length of Rust Vec
}
impl<T: Sized> FfiVec<T> {
pub fn from_vec(v: Vec<T>) -> Self {
let len = v.len() as c_uint;
FfiVec {
ptr: Box::into_raw(Box::new(v)),
len,
}
}
pub fn drop(self) {
unsafe {
std::ptr::drop_in_place(self.ptr);
std::alloc::dealloc(self.ptr as *mut u8, Layout::new::<Vec<T>>())
}
}
}
#[no_mangle]
pub extern "C" fn compute_my_vec() -> FfiVec<u8> {...}
#[no_mangle]
pub extern "C" fn copy_from_my_vec(ffi_vec: FfiVec<u8>, buffer: *mut u8) {
let len = ffi_vec.len as usize;
let buffer_slice = unsafe { std::slice::from_raw_parts_mut(buffer, len) };
// Borrow the Vec behind the raw pointer to read its elements
let source = unsafe { &*ffi_vec.ptr };
for (i, &item) in source.iter().take(len).enumerate() {
buffer_slice[i] = item;
}
// Free the Vec now that its contents have been copied out
ffi_vec.drop()
}
Is there a simpler way to do this? Or a common library that can do this for me?

Mapping a struct to an underlying buffer

I am attempting to map a simple struct to an underlying buffer as follows, where modifying the struct also modifies the buffer:
#[repr(C, packed)]
pub struct User {
id: u8,
username: [u8; 20],
}
fn main() {
let buf = [97u8; 21];
let mut u: User = unsafe { std::ptr::read(buf.as_ptr() as *const _) };
let buf_addr = &buf[0] as *const u8;
let id_addr = &u.id as *const u8;
println!("buf addr: {:p} id addr: {:p} id val: {}", buf_addr, id_addr, u.id);
assert_eq!(buf_addr, id_addr); // TODO addresses not equal
u.id = 10;
println!("id val: {}", u.id);
println!("{:?}", &buf);
assert_eq!(buf[0], u.id); // TODO buffer not updated
}
However the starting address of the buffer is different to the address of the first member in the struct and modifying the struct does not modify the buffer. What is wrong with the above example?
The struct only contains owned values. That means that, in order to construct one, you have to copy data into it. And that is exactly what you are doing when you use ptr::read.
But what you want to do (at least with the code presented) is not possible. If you work around Rust's safety checks with unsafe code, you end up with two mutable references to the same data, which is Undefined Behaviour.
You can, however, create a safe API of mutable "views" onto the data, something like this:
#[repr(C, packed)]
pub struct User {
id: u8,
username: [u8; 20],
}
pub struct RawUser {
buf: [u8; 21],
}
impl RawUser {
pub fn as_bytes_mut(&mut self) -> &mut [u8; 21] {
&mut self.buf
}
pub fn as_user_mut(&mut self) -> &mut User {
unsafe { &mut *(self.buf.as_mut_ptr() as *mut _) }
}
}
These accessors let you view the same data in different ways, while allowing the Rust borrow checker to enforce memory safety. Usage looks like this:
fn main() {
let buf = [97u8; 21];
let mut u: RawUser = RawUser { buf };
let user = u.as_user_mut();
user.id = 10;
println!("id val: {}", user.id); // id val: 10
let bytes = u.as_bytes_mut();
// it would be a compile error to try to access `user` here
assert_eq!(bytes[0], 10);
}

Creating a Vec in Rust from a C array pointer and safely freeing it?

I'm calling a C function from Rust which takes a null pointer as an argument, then allocates some memory and points it there.
What is the correct way to efficiently (i.e. avoiding unnecessary copies) and safely (i.e. avoid memory leaks or segfaults) turn data from the C pointer into a Vec?
I've got something like:
extern "C" {
// C function that allocates an array of floats
fn allocate_data(data_ptr: *mut *const f32, data_len: *mut i32);
}
fn get_vec() -> Vec<f32> {
// C will set this to length of array it allocates
let mut data_len: i32 = 0;
// C will point this at the array it allocates
let mut data_ptr: *const f32 = std::ptr::null_mut();
unsafe { allocate_data(&mut data_ptr, &mut data_len) };
let data_slice = unsafe { slice::from_raw_parts(data_ptr as *const f32, data_len as usize) };
data_slice.to_vec()
}
If I understand correctly, .to_vec() will copy data from the slice into a new Vec, so the underlying memory will still need to be freed (as the underlying memory for the slice won't be freed when it's dropped).
What is the correct approach for dealing with the above?
can I create a Vec which takes ownership of the underlying memory, which is freed when the Vec is freed?
if not, where/how in Rust should I free the memory that the C function allocated?
anything else in the above that could/should be improved on?
can I create a Vec which takes ownership of the underlying memory, which is freed when the Vec is freed?
Not safely, no. You must not use Vec::from_raw_parts unless the pointer came from a Vec originally (well, from the same memory allocator). Otherwise, you will try to free memory that your allocator doesn't know about; a very bad idea.
Note that the same thing is true for String::from_raw_parts, as a String is a wrapper for a Vec<u8>.
where/how in Rust should I free the memory that the C function allocated?
As soon as you are done with it and no sooner.
anything else in the above that could/should be improved on?
There's no need to cast the pointer when calling slice::from_raw_parts
There's no need for explicit types on the variables
Use ptr::null, not ptr::null_mut
Perform a NULL pointer check
Check the length is non-negative
use std::{ptr, slice};
extern "C" {
fn allocate_data(data_ptr: *mut *const f32, data_len: *mut i32);
fn deallocate_data(data_ptr: *const f32);
}
fn get_vec() -> Vec<f32> {
let mut data_ptr = ptr::null();
let mut data_len = 0;
unsafe {
allocate_data(&mut data_ptr, &mut data_len);
assert!(!data_ptr.is_null());
assert!(data_len >= 0);
let v = slice::from_raw_parts(data_ptr, data_len as usize).to_vec();
deallocate_data(data_ptr);
v
}
}
fn main() {}
You didn't state why you need it to be a Vec, but if you never need to change the size, you can create your own type that can be dereferenced as a slice and drops the data when appropriate:
use std::{ptr, slice};
extern "C" {
fn allocate_data(data_ptr: *mut *const f32, data_len: *mut i32);
fn deallocate_data(data_ptr: *const f32);
}
struct CVec {
ptr: *const f32,
len: usize,
}
impl std::ops::Deref for CVec {
type Target = [f32];
fn deref(&self) -> &[f32] {
unsafe { slice::from_raw_parts(self.ptr, self.len) }
}
}
impl Drop for CVec {
fn drop(&mut self) {
unsafe { deallocate_data(self.ptr) };
}
}
fn get_vec() -> CVec {
let mut ptr = ptr::null();
let mut len = 0;
unsafe {
allocate_data(&mut ptr, &mut len);
assert!(!ptr.is_null());
assert!(len >= 0);
CVec {
ptr,
len: len as usize,
}
}
}
fn main() {}
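To try the wrapper without a C library, the two foreign functions can be replaced by Rust stand-ins. The sketch below assumes hypothetical implementations of `allocate_data` and `deallocate_data` that hand out a fixed four-element array, plus a `FREED` counter added purely to observe that dropping the wrapper releases the buffer exactly once. The `CVec` type itself follows the answer.

```rust
use std::ops::Deref;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::{ptr, slice};

const DEMO_LEN: usize = 4;
// Counts calls to the stand-in deallocator (illustration only).
static FREED: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for the C function that allocates the array.
unsafe fn allocate_data(data_ptr: *mut *const f32, data_len: *mut i32) {
    let boxed: Box<[f32; DEMO_LEN]> = Box::new([1.0, 2.0, 3.0, 4.0]);
    unsafe {
        *data_ptr = Box::into_raw(boxed) as *const f32;
        *data_len = DEMO_LEN as i32;
    }
}

// Hypothetical stand-in for the matching C deallocation function.
unsafe fn deallocate_data(data_ptr: *const f32) {
    FREED.fetch_add(1, Ordering::SeqCst);
    drop(unsafe { Box::from_raw(data_ptr as *mut [f32; DEMO_LEN]) });
}

struct CVec {
    ptr: *const f32,
    len: usize,
}

impl Deref for CVec {
    type Target = [f32];
    fn deref(&self) -> &[f32] {
        unsafe { slice::from_raw_parts(self.ptr, self.len) }
    }
}

impl Drop for CVec {
    fn drop(&mut self) {
        // Give the memory back to the side that allocated it.
        unsafe { deallocate_data(self.ptr) };
    }
}

fn get_vec() -> CVec {
    let mut data_ptr: *const f32 = ptr::null();
    let mut data_len: i32 = 0;
    unsafe { allocate_data(&mut data_ptr, &mut data_len) };
    assert!(!data_ptr.is_null());
    assert!(data_len >= 0);
    CVec { ptr: data_ptr, len: data_len as usize }
}
```

Because `CVec` dereferences to `[f32]`, slice methods such as `len()`, `iter()` and `to_vec()` are available on it directly.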
See also:
How to convert a *const pointer into a Vec to correctly drop it?
Is it possible to call a Rust function taking a Vec from C?

How to expose a Rust `Vec<T>` to FFI?

I'm trying to construct a pair of elements:
array: *mut T
array_len: usize
array is intended to own the data
However, Box::into_raw will return *mut [T]. I cannot find any info on converting raw pointers to slices. What is its layout in memory? How do I use it from C? Should I convert to *mut T? If so, how?
If you just want some C function to mutably borrow the Vec, you can do it like this:
extern "C" {
fn some_c_function(ptr: *mut i32, len: ffi::size_t);
}
fn safe_wrapper(a: &mut [i32]) {
unsafe {
some_c_function(a.as_mut_ptr(), a.len() as ffi::size_t);
}
}
Of course, the C function shouldn't store this pointer somewhere else because that would break aliasing assumptions.
If you want to "pass ownership" of the data to C code, you'd do something like this:
use std::mem;
extern "C" {
fn c_sink(ptr: *mut i32, len: ffi::size_t);
}
fn sink_wrapper(mut vec: Vec<i32>) {
vec.shrink_to_fit();
assert!(vec.len() == vec.capacity());
let ptr = vec.as_mut_ptr();
let len = vec.len();
mem::forget(vec); // prevent deallocation in Rust
// The array is still there but no Rust object
// feels responsible. We only have ptr/len now
// to reach it.
unsafe {
c_sink(ptr, len as ffi::size_t);
}
}
Here, the C function "takes ownership" in the sense that we expect it to eventually return the pointer and length to Rust, for example, by calling a Rust function to deallocate it:
#[no_mangle]
/// This is intended for the C code to call for deallocating the
/// Rust-allocated i32 array.
unsafe extern "C" fn deallocate_rust_buffer(ptr: *mut i32, len: ffi::size_t) {
let len = len as usize;
drop(Vec::from_raw_parts(ptr, len, len));
}
Because Vec::from_raw_parts expects three parameters, a pointer, a size and a capacity, we either have to keep track of the capacity as well somehow, or we use Vec's shrink_to_fit before passing the pointer and length to the C function. This might involve a reallocation, though.
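The other option mentioned above, keeping track of the capacity, could be sketched as follows. The struct and function names (`RawI32Buffer`, `into_raw_buffer`, `free_raw_buffer`) are hypothetical; the idea is simply to hand all three values across the boundary so that no `shrink_to_fit` call, and therefore no possible reallocation, is needed.

```rust
use std::mem::ManuallyDrop;

/// Everything needed to rebuild the Vec later (hypothetical layout).
#[repr(C)]
pub struct RawI32Buffer {
    pub ptr: *mut i32,
    pub len: usize,
    pub cap: usize,
}

/// Gives up ownership of the Vec without freeing its allocation.
pub fn into_raw_buffer(vec: Vec<i32>) -> RawI32Buffer {
    // ManuallyDrop prevents the Vec destructor from running.
    let mut vec = ManuallyDrop::new(vec);
    RawI32Buffer {
        ptr: vec.as_mut_ptr(),
        len: vec.len(),
        cap: vec.capacity(),
    }
}

/// Rebuilds the Vec and drops it, releasing the allocation.
///
/// # Safety
/// `buf` must come from `into_raw_buffer` and must not be freed twice.
pub unsafe fn free_raw_buffer(buf: RawI32Buffer) {
    drop(unsafe { Vec::from_raw_parts(buf.ptr, buf.len, buf.cap) });
}
```

The cost of this design is one extra field that the C side has to carry around untouched and pass back unchanged.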
You could use [T]::as_mut_ptr to obtain the *mut T pointer directly from Vec<T>, Box<[T]> or any other DerefMut-to-slice types.
use std::mem;
let mut boxed_slice: Box<[T]> = vector.into_boxed_slice();
let array: *mut T = boxed_slice.as_mut_ptr();
let array_len: usize = boxed_slice.len();
// Prevent the slice from being destroyed (Leak the memory).
mem::forget(boxed_slice);
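The leaked memory eventually has to come back to Rust to be freed. A minimal sketch of the full round trip, assuming hypothetical function names `export_bytes` and `release_bytes` and a concrete `u8` element type: the pair is rebuilt into a `*mut [u8]` with `std::ptr::slice_from_raw_parts_mut` and handed to `Box::from_raw`. Going the other way, a `*mut [T]` obtained from `Box::into_raw` can be turned into a `*mut T` with an `as` cast.

```rust
use std::{mem, ptr};

/// Hands the elements out as a (pointer, length) pair and leaks them.
pub fn export_bytes(vector: Vec<u8>) -> (*mut u8, usize) {
    // into_boxed_slice drops any excess capacity, so the length alone
    // describes the allocation.
    let mut boxed_slice: Box<[u8]> = vector.into_boxed_slice();
    let array = boxed_slice.as_mut_ptr();
    let array_len = boxed_slice.len();
    mem::forget(boxed_slice);
    (array, array_len)
}

/// Reclaims and frees a buffer produced by `export_bytes`.
///
/// # Safety
/// The pair must come from `export_bytes` and must be released only once.
pub unsafe fn release_bytes(array: *mut u8, array_len: usize) {
    // Rebuild the fat pointer, then let Box free the allocation.
    let raw: *mut [u8] = ptr::slice_from_raw_parts_mut(array, array_len);
    drop(unsafe { Box::from_raw(raw) });
}
```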

Resources