C library freeing a pointer coming from Rust - rust

I want to do Rust bindings to a C library which requires a callback, and this callback must return a C-style char* pointer to the C library which will then free it.
The callback must be in some sense exposed to the user of my library (probably using closures), and I want to provide a Rust interface as convenient as possible (meaning accepting a String output if possible).
However, the C library complains when trying to free() a pointer coming from memory allocated by Rust, probably because Rust uses jemalloc and the C library uses malloc.
So currently I can see two workarounds using libc::malloc(), but both of them have disadvantages:
Give the user of the library a slice that he must fill (inconvenient, and imposes length restrictions)
Take his String output, copy it to an array allocated by malloc, and then free the String (useless copy and allocation)
Can anybody see a better solution?
Here is an equivalent of the interface of the C library, and the implementation of the ideal case (if the C library could free a String allocated in Rust)
extern crate libc;
use std::ffi::CString;
use libc::*;
use std::mem;
extern "C" {
// The second parameter of this function gets passed as an argument of my callback
fn need_callback(callback: extern fn(arbitrary_data: *mut c_void) -> *mut c_char,
arbitrary_data: *mut c_void);
}
// This function must return a C-style char[] that will be freed by the C library
extern fn my_callback(arbitrary_data: *mut c_void) -> *mut c_char {
unsafe {
let mut user_callback: *mut &'static mut FnMut() -> String = mem::transmute(arbitrary_data); //'
let user_string = (*user_callback)();
let c_string = CString::new(user_string).unwrap();
let ret: *mut c_char = mem::transmute(c_string.as_ptr());
mem::forget(c_string); // To prevent deallocation by Rust
ret
}
}
pub fn call_callback(mut user_callback: &mut FnMut() -> String) {
unsafe {
need_callback(my_callback, mem::transmute(&mut user_callback));
}
}
The C part would be equivalent to this:
#include <stdlib.h>
typedef char* (*callback)(void *arbitrary_data);
void need_callback(callback cb, void *arbitrary_data) {
char *user_return = cb(arbitrary_data);
free(user_return); // Complains as the pointer has been allocated with jemalloc
}

It might require some annoying work on your part, but what about exposing a type that implements Write, but is backed by memory allocated via malloc? Then, your client can use the write! macro (and friends) instead of allocating a String.
Here's how it currently works with Vec:
use std::io::Write; // brings write_fmt, used by write!, into scope
let mut v = Vec::new();
write!(&mut v, "hello, world").unwrap();
You would "just" need to implement the two required methods of Write (write and flush) and then you would have a stream-like interface.
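To make the idea concrete, here is a minimal sketch of such a type. The name MallocBuf and its into_c_string method are made up for illustration, and it assumes the C library releases the string with the system free(), which is why the C allocator functions are declared directly:

```rust
use std::io::{self, Write};
use std::os::raw::{c_char, c_void};
use std::ptr;

// Assumption: the C library frees with the system allocator, so the same
// realloc/free are bound here directly.
unsafe extern "C" {
    fn realloc(p: *mut c_void, size: usize) -> *mut c_void;
    fn free(p: *mut c_void);
}

/// Hypothetical growable byte buffer whose storage comes from realloc.
pub struct MallocBuf {
    ptr: *mut u8,
    len: usize,
    cap: usize,
}

impl MallocBuf {
    pub fn new() -> Self {
        MallocBuf { ptr: ptr::null_mut(), len: 0, cap: 0 }
    }

    /// Make sure at least `extra` more bytes fit in the buffer.
    fn reserve(&mut self, extra: usize) -> io::Result<()> {
        let needed = self
            .len
            .checked_add(extra)
            .ok_or_else(|| io::Error::new(io::ErrorKind::Other, "length overflow"))?;
        if needed <= self.cap {
            return Ok(());
        }
        let new_cap = needed.max(self.cap.saturating_mul(2)).max(16);
        // realloc(NULL, n) behaves like malloc(n)
        let p = unsafe { realloc(self.ptr as *mut c_void, new_cap) } as *mut u8;
        if p.is_null() {
            return Err(io::Error::new(io::ErrorKind::Other, "out of memory"));
        }
        self.ptr = p;
        self.cap = new_cap;
        Ok(())
    }

    /// Append the terminating NUL and give up ownership of the buffer.
    /// The returned pointer must be released with free().
    pub fn into_c_string(mut self) -> io::Result<*mut c_char> {
        self.reserve(1)?;
        unsafe { *self.ptr.add(self.len) = 0 };
        let p = self.ptr as *mut c_char;
        self.ptr = ptr::null_mut(); // Drop now calls free(NULL), a no-op
        Ok(p)
    }
}

impl Write for MallocBuf {
    fn write(&mut self, data: &[u8]) -> io::Result<usize> {
        if data.is_empty() {
            return Ok(0);
        }
        if data.contains(&0) {
            return Err(io::Error::new(io::ErrorKind::InvalidInput, "interior NUL byte"));
        }
        self.reserve(data.len())?;
        unsafe { ptr::copy_nonoverlapping(data.as_ptr(), self.ptr.add(self.len), data.len()) };
        self.len += data.len();
        Ok(data.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

impl Drop for MallocBuf {
    fn drop(&mut self) {
        // Only reached with a non-null pointer if ownership was never handed out
        unsafe { free(self.ptr as *mut c_void) };
    }
}
```

The user's closure would then receive a &mut MallocBuf and call write! on it, and the callback would return into_c_string() to the C library. This avoids the second allocation and copy, at the cost of not accepting a ready-made String.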

Related

What is the correct way to drop a mem::forgotten array of strings sent to C via an out pointer and then passed back to Rust for deletion?

I would like to return some strings to C via a Rust FFI call. I also would like to ensure they're cleaned up properly.
I'm creating the strings on the Rust side and turning them into an address of an array of strings.
use core::mem;
use std::ffi::CString;
#[no_mangle]
pub extern "C" fn drop_rust_memory(mem: *mut ::libc::c_void) {
unsafe {
let boxed = Box::from_raw(mem);
mem::drop(boxed);
}
}
#[no_mangle]
pub extern "C" fn string_xfer(strings_out: *mut *mut *mut ::libc::c_char) -> usize {
unsafe {
let s1 = CString::new("String 1").unwrap();
let s2 = CString::new("String 2").unwrap();
let s1 = s1.into_raw();
let s2 = s2.into_raw();
let strs = vec![s1, s2];
let len = strs.len();
let mut boxed_slice = strs.into_boxed_slice();
*strings_out = boxed_slice.as_mut_ptr() as *mut *mut ::libc::c_char;
mem::forget(boxed_slice);
len
}
}
On the C side, I call the Rust FFI function, print the strings and then attempt to delete them via another Rust FFI call.
extern size_t string_xfer(char ***out);
extern void drop_rust_memory(void* mem);
int main() {
char **receiver;
int len = string_xfer(&receiver);
for (int i = 0; i < len; i++) {
printf("<%s>\n", receiver[i]);
}
drop_rust_memory(receiver);
printf("# rust memory dropped\n");
for (int i = 0; i < len; i++) {
printf("<%s>\n", receiver[i]);
}
return 0;
}
This appears to work. For the second printing after the drop, I would expect to get a crash or some undefined behavior, but I get this
<String 1>
<String 2>
# rust memory dropped
<(null)>
<String 2>
which makes me less sure about the entire thing.
First, you may want to take a look at "Catching panic! when Rust called from C FFI, without spawning threads", because a panic invokes undefined behaviour in this case. So you had better either catch the panic or avoid code that can panic.
Secondly, into_boxed_slice() is primarily useful when you no longer need the features of a vector, "a contiguous growable array type". You could also use as_mut_ptr() and forget the vector. The choice depends on whether you want to carry the capacity information into C, so that the vector can grow later, or not. (I think Vec is missing an into_raw() method, but I'm sure you can write one (just an example) to avoid repeating critical code.) You could also use Box::into_raw() followed by a cast to turn the slice into a pointer:
use std::ffi::CString;
use std::panic;
use std::ptr;
pub unsafe extern "C" fn string_xfer(len: &mut libc::size_t) -> *mut *mut libc::c_char {
// Catch any panic so that it cannot unwind into the C caller
if let Ok(slice) = panic::catch_unwind(move || {
let s1 = CString::new("String 1").unwrap();
let s2 = CString::new("String 2").unwrap();
let strs = vec![s1.into_raw(), s2.into_raw()];
Box::into_raw(strs.into_boxed_slice())
}) {
*len = (&*slice).len();
slice as *mut *mut libc::c_char
} else {
// Option<*mut T> has no guaranteed C layout, so failure is reported as NULL
ptr::null_mut()
}
}
Third, your drop_rust_memory() only drops a pointer, and I think it is outright UB. Rust's memory deallocation needs the real size of the allocation (the capacity). You didn't give the size of your slice, so you are telling Rust "free this pointer, which points to a c_void", and that is not the right size. You need to use from_raw_parts_mut(), and your C code must pass the length of the slice back to the Rust code. You also need to free each CString properly by calling from_raw() on it (More information here):
use std::ffi::CString;
pub unsafe extern "C" fn drop_rust_memory(mem: *mut *mut libc::c_char, len: libc::size_t) {
let slice = Box::from_raw(std::slice::from_raw_parts_mut(mem, len));
for &x in slice.iter() {
// Rebuild each CString so that its destructor frees the buffer
drop(CString::from_raw(x));
} // the strings are freed here; don't use mem or its elements afterwards
}
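As a self-contained illustration of the pairing described above, the following sketch hands out a boxed slice of C strings and later reclaims it with the same length. The function names strings_into_raw and strings_from_raw are hypothetical, and plain usize is used instead of libc::size_t so that no external crate is needed:

```rust
use std::ffi::CString;
use std::os::raw::c_char;

/// Hypothetical allocation side: returns a pointer to `*len` C strings.
/// Ownership of both the array and the strings passes to the caller.
pub fn strings_into_raw(items: &[&str], len: &mut usize) -> *mut *mut c_char {
    let ptrs: Vec<*mut c_char> = items
        .iter()
        .map(|s| CString::new(*s).expect("no interior NUL").into_raw())
        .collect();
    // A boxed slice has capacity == length, so the length alone is enough
    // to rebuild it later.
    let boxed: Box<[*mut c_char]> = ptrs.into_boxed_slice();
    *len = boxed.len();
    Box::into_raw(boxed) as *mut *mut c_char
}

/// Matching release side: must receive exactly the pointer and length
/// produced by `strings_into_raw`, and must be called only once.
pub unsafe fn strings_from_raw(ptr: *mut *mut c_char, len: usize) {
    unsafe {
        // Rebuild the fat slice pointer, then the Box that owns the array
        let boxed: Box<[*mut c_char]> =
            Box::from_raw(std::ptr::slice_from_raw_parts_mut(ptr, len));
        for &p in boxed.iter() {
            // Rebuild each CString so that its destructor frees the bytes
            drop(CString::from_raw(p));
        }
        // `boxed` is dropped here, freeing the array itself
    }
}
```

The point of the design is symmetry: every into_raw has exactly one matching from_raw, performed by Rust and with the same type and length as the original allocation.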
To conclude, you should read more about undefined behaviour. It is not about "expecting a crash" or expecting that "something" should happen: when your program triggers UB, anything can happen. You can read more about UB in this amazing LLVM blog post.
A note about C style: prefer returning the pointer rather than the size, because strings_out: *mut *mut *mut ::libc::c_char is an ugly thing; write pub extern fn string_xfer(size: &mut libc::size_t) -> *mut *mut libc::c_char instead. See also "How to check if function pointer passed from C is non-NULL".

Why does a Box pointer passed to C and back to Rust segfault?

Some C code calls into the Rust open call below which returns a pointer. Later the C code passes the exact same pointer back to the close function which tries to drop (free) it. It segfaults in free(3). Why?
use std::os::raw::{c_int, c_void};
struct Handle;
extern "C" fn open(_readonly: c_int) -> *mut c_void {
let h = Handle;
let h = Box::new(h);
return Box::into_raw(h) as *mut c_void;
}
extern "C" fn close(h: *mut c_void) {
let h = unsafe { Box::from_raw(h) };
// XXX This segfaults - why?
drop(h);
}
In close, you end up creating a Box<c_void> instead of a Box<Handle> because you didn't cast the *mut c_void back to *mut Handle before invoking Box::from_raw.
fn close(h: *mut c_void) {
let h = unsafe { Box::from_raw(h as *mut Handle) };
drop(h);
}
By the way, Box doesn't actually allocate any memory for a zero-sized type (such as Handle here) and uses a fixed, non-zero pointer value instead (which, in the current implementation, is the alignment of the type; a zero-sized type has an alignment of 1 by default). The destructor for a boxed zero-sized type knows not to try to deallocate memory at this fictitious memory address, but c_void is not a zero-sized type (it has size 1), so the destructor for Box<c_void> tries to free memory at address 0x1, which causes the segfault.
If Handle wasn't zero-sized, then the code may not crash, but it would still run the wrong destructor (it'd run c_void's destructor, which does nothing), and this may cause memory leaks. A destructor runs Drop::drop for the type if present, then drops the type's fields.
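To see the destructor point in action, here is a small sketch with a non-zero-sized handle. The DROPPED counter and the open_handle/close_handle names are invented for the example; the counter exists only so that the effect of the cast is observable:

```rust
use std::os::raw::c_void;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts destructor runs, purely so the example can observe them.
pub static DROPPED: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical handle that is not zero-sized and has a destructor.
pub struct Handle {
    pub id: u32,
}

impl Drop for Handle {
    fn drop(&mut self) {
        DROPPED.fetch_add(1, Ordering::SeqCst);
    }
}

/// Give C an opaque pointer; the type information is lost at this point.
pub extern "C" fn open_handle(id: u32) -> *mut c_void {
    Box::into_raw(Box::new(Handle { id })) as *mut c_void
}

/// Take the opaque pointer back. The cast restores the real type, so the
/// Box runs Handle's destructor and frees an allocation of the right size.
pub unsafe extern "C" fn close_handle(h: *mut c_void) {
    if h.is_null() {
        return;
    }
    drop(unsafe { Box::from_raw(h as *mut Handle) });
}
```

Without the `as *mut Handle` cast, the Box would be a Box<c_void> and Handle's Drop implementation would never run.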
The problem is you didn't cast the pointer back to a Handle pointer while transforming it back to a Box, and got a Box of the wrong type.
This works:
fn close(h: *mut c_void) {
let h = unsafe { Box::from_raw(h as *mut Handle) };
// ^^^^^^^^^^^^^^
drop(h);
}
In your code, h is a std::boxed::Box<std::ffi::c_void>.

Create *mut *mut to a struct

I'm trying to call pthread_join with a pointer to my struct in order that the C thread can fill in the struct to the memory I point it to. (Yes, I'm aware that this is highly unsafe..)
The function signature of pthread_join:
pub unsafe extern fn pthread_join(native: pthread_t,
value: *mut *mut c_void)
-> c_int
I'm doing this as an exercise of porting C code from a book to Rust. The C code:
pthread_t tid1;
struct foo *fp;
err = pthread_create(&tid1, NULL, thr_fn1, NULL);
err = pthread_join(tid1, (void *)&fp);
I came up with this code:
extern crate libc;
use libc::{pthread_t, pthread_join};
struct Foo {}
fn main() {
let tid1:pthread_t = std::mem::uninitialized();
let mut fp:Box<Foo> = std::mem::uninitialized();
let value = &mut fp;
pthread_join(tid1, &mut value);
}
But the error I see is:
error[E0308]: mismatched types
--> src/bin/11-threads/f04-bogus-pthread-exit.rs:51:24
|
51 | pthread_join(tid1, &mut value);
| ^^^^^^^^^^ expected *-ptr, found mutable reference
|
= note: expected type `*mut *mut libc::c_void`
found type `&mut &mut std::boxed::Box<Foo>`
Is it even possible to achieve this just using casts, or do I need to transmute?
There are several issues here:
Box is a pointer to a heap-allocated resource; you can extract the pointer itself using Box::into_raw(some_box).
References are not silently coerced into pointers (even though they have the same representation); you need an explicit cast.
You need to cast from your concrete type to c_void; type inference may be able to do that.
You have a reference to a reference to a pointer, but you need a pointer to a pointer; you have one too many levels of indirection.
Let's make it work:
// pthread interface, reduced
struct Void;
fn sample(_: *mut *mut Void) {}
// actual code
struct Foo {}
fn main() {
let mut p = Box::into_raw(Box::new(Foo{})) as *mut Void;
sample(&mut p as *mut _);
}
Note that this is leaking memory (as a result of into_raw), normally the memory should be shoved back into a Box with from_raw for the destructor of Foo to be called and the memory to be freed.
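For completeness, this is roughly what "shoving the memory back into a Box" looks like when the pointer travels through an out-parameter. The produce function is a made-up stand-in for a C-style API, not pthread_join itself, and Foo has been given a field only so that there is something to check:

```rust
use std::os::raw::c_void;
use std::ptr;

pub struct Foo {
    pub value: i32,
}

/// Hypothetical stand-in for a C-style function with an out-parameter:
/// it stores one pointer into the slot that `out` points to.
pub unsafe extern "C" fn produce(out: *mut *mut c_void) {
    let boxed = Box::new(Foo { value: 42 });
    unsafe { *out = Box::into_raw(boxed) as *mut c_void };
}

/// Caller side: provide a pointer-sized slot, then reclaim ownership.
pub fn receive() -> Box<Foo> {
    // One level of indirection: a single raw pointer for the callee to fill
    let mut p: *mut c_void = ptr::null_mut();
    unsafe {
        produce(&mut p as *mut *mut c_void);
        // Cast back to the concrete type before rebuilding the Box, so
        // that Foo's destructor runs and the memory is freed on drop
        Box::from_raw(p as *mut Foo)
    }
}
```

This only works because both sides agree that the pointer came from Box::into_raw of a Foo; a pointer allocated by C code must not be passed to Box::from_raw.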
The code can't work as written; that is because the C thread doesn't really "fill in the struct" in the memory you point to. It is responsible for allocating its own memory (or receiving it from another thread beforehand) and filling it out. The only thing the C thread "returns" is a single address, and this address is picked up by pthread_join.
This is why pthread_join receives a void **, i.e. the pointer to a void *. That kind of output parameter enables pthread_join to store (return) the void * pointer provided by the freshly finished thread. The thread can provide the pointer either by passing it to pthread_exit or by returning it from the start_routine passed to pthread_create. In Rust, the raw pointer can be received with code like this:
let mut c_result: *mut libc::c_void = ptr::null_mut();
libc::pthread_join(tid1, &mut c_result as *mut _);
// C_RESULT now contains the raw pointer returned by the worker's
// start routine, or passed to pthread_exit()
The contents and size of the memory that the returned pointer points to are a matter of contract between the thread being joined and the thread that is joining it. If the worker thread is implemented in C and designed to be invoked by other C code, then an obvious choice is for it to allocate memory for a result structure, fill it out, and provide a pointer to allocated memory. For example:
struct ThreadResult { ... };
...
struct ThreadResult *result = malloc(sizeof(struct ThreadResult));
result->field1 = value1;
...
pthread_exit(result);
In that case your Rust code that joins the thread can interpret the result by replicating the C structure and picking up its ownership:
// obtain a raw-pointer c_result through pthread_join as
// shown above:
let mut c_result = ...;
libc::pthread_join(tid1, &mut c_result as *mut _);
#[repr(C)]
struct ThreadResult { ... } // fields copy-pasted from C
unsafe {
// convert the raw pointer to a Rust reference, so that we may
// inspect its contents
let result = &mut *(c_result as *mut ThreadResult);
// ... inspect result.field1, etc ...
// free the memory allocated in the thread
libc::free(c_result);
// RESULT is no longer usable
}
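Putting the pieces together, a complete round trip might look like the sketch below. Several things in it are assumptions rather than part of the answer: the worker is written in Rust instead of C to keep the example in one file, the pthread functions are declared by hand instead of coming from the libc crate, pthread_t is taken to be an unsigned long as on Linux, and ThreadResult has a single made-up field:

```rust
use std::os::raw::{c_int, c_ulong, c_void};
use std::ptr;

// Assumption: Linux, where pthread_t is an unsigned long
#[allow(non_camel_case_types)]
type pthread_t = c_ulong;

unsafe extern "C" {
    fn malloc(size: usize) -> *mut c_void;
    fn free(p: *mut c_void);
    fn pthread_create(
        thread: *mut pthread_t,
        attr: *const c_void, // really `const pthread_attr_t *`; only NULL is passed
        start_routine: extern "C" fn(*mut c_void) -> *mut c_void,
        arg: *mut c_void,
    ) -> c_int;
    fn pthread_join(thread: pthread_t, retval: *mut *mut c_void) -> c_int;
}

/// Layout shared between the worker and the joining thread.
#[repr(C)]
pub struct ThreadResult {
    pub field1: c_int,
}

/// Start routine: allocates its own result with malloc and returns the address.
extern "C" fn worker(_arg: *mut c_void) -> *mut c_void {
    unsafe {
        let result = malloc(std::mem::size_of::<ThreadResult>()) as *mut ThreadResult;
        if !result.is_null() {
            result.write(ThreadResult { field1: 42 });
        }
        result as *mut c_void
    }
}

/// Spawn the worker, join it, read the result, and free it with free().
pub fn run_and_join() -> Option<c_int> {
    unsafe {
        let mut tid: pthread_t = 0;
        if pthread_create(&mut tid, ptr::null(), worker, ptr::null_mut()) != 0 {
            return None;
        }
        // pthread_join stores the worker's return value into this slot
        let mut c_result: *mut c_void = ptr::null_mut();
        if pthread_join(tid, &mut c_result) != 0 || c_result.is_null() {
            return None;
        }
        let value = (*(c_result as *mut ThreadResult)).field1;
        // The memory came from malloc, so it goes back through free
        free(c_result);
        Some(value)
    }
}
```

Note that the joining side never allocates a ThreadResult; it only provides a pointer-sized slot for pthread_join to fill.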

Readline custom completer

I'm trying to write a readline custom completer (tab completion) in Rust. I think I have everything straight, but when I try it en vivo it heads off into the weeds and never comes back. Oddly, when I call it directly from main() I appear to get a valid pointer back. I never see a crash or panic in either case. Backtrace output is not consistent over runs (it's busy doing something). Perhaps one clue is that gdb indicates that the arguments passed to the completer are incorrect (although I'm not actually using them). Eg, after callback:
#2 0x00007f141f701272 in readlinetest::complete (text=0x7f141ff27d10 "", start=2704437, end=499122176) at src/main.rs:24
Or directly, breakpointing the call in main:
#0 readlinetest::complete (text=0x555555559190 <complete::hcda8d6cb2ef52a1bKaa> "dH;$%p", start=0, end=0) at src/main.rs:21
Do I have an ABI problem? Seems unlikely and the function signature isn't complicated :(
Here is a small test project: Cargo.toml:
[package]
name = "readlinetest"
version = "0.1.0"
authors = ["You <you@example.com>"]
[dependencies]
libc = "*"
readline = "*"
And main.rs:
extern crate libc;
extern crate readline;
use libc::{c_char, c_int};
use std::ffi::CString;
use std::process::exit;
use std::ptr;
use std::str;
extern { fn puts(s: *const libc::c_char); }
#[link(name = "readline")]
// Define the global in libreadline that will point to our completion function.
extern {
static mut rl_completion_entry_function: extern fn(text: *const c_char,
start: c_int,
end: c_int) -> *const *const c_char;
}
// Our completion function. Returns two strings.
extern fn complete(text: *const c_char, start: c_int, end: c_int) -> *const *const c_char {
let _ = text; let _ = start; let _ = end;
let mut words:Vec<*const c_char> =
vec!(CString::new("one").unwrap(), CString::new("two").unwrap()).
iter().
map(|arg| arg.as_ptr()).
collect();
words.push(ptr::null()); // append null
words.as_ptr() as *const *const c_char
}
fn main() {
let words = complete(string_to_mut_c_char("hi"), 1, 2);
unsafe { puts(*words) } // prints "one"
//unsafe { puts((*words + ?)) } // not sure how to get to "two"
unsafe { rl_completion_entry_function = complete }
// Loop until EOF: echo input to stdout
loop {
if let Ok(input) = readline::readline_bare(&CString::new("> ").unwrap()) {
let text = str::from_utf8(&input.to_bytes()).unwrap();
println!("{}", text);
} else { // EOF/^D
exit(0)
}
}
}
// Just for testing
fn string_to_mut_c_char(s: &str) -> *mut c_char {
let mut bytes = s.to_string().into_bytes(); // Vec<u8>
bytes.push(0); // terminating null
let mut cchars = bytes.iter().map(|b| *b as c_char).collect::<Vec<c_char>>();
let name: *mut c_char = cchars.as_mut_ptr();
name
}
Ubuntu 14.04, 64 bit with Rust 1.3.
What am I missing? Thanks for any pointers (ha ha...).
"and the function signature isn't complicated"
It's not, but it does help to have the right one... ^_^ From my local version of readline (6.3.8):
extern rl_compentry_func_t *rl_completion_entry_function;
typedef char *rl_compentry_func_t PARAMS((const char *, int));
Additionally, you have multiple use after free errors:
vec![CString::new("one").unwrap()].iter().map(|s| s.as_ptr());
This creates a CString and gets the pointer to it. When the statement is done, nothing owns the vector that owns the strings. The vector will be immediately dropped, dropping the strings, invalidating the pointers.
words.as_ptr() as *const *const c_char
Similar thing here — you take the pointer, but then nothing owns the words vector anymore, so it is dropped, invalidating that pointer. So now you have an invalid pointer which attempts to point to a sequence of invalid pointers.
The same problem can be found in string_to_mut_c_char.
I don't know enough readline to understand who is supposed to own the returned strings, but it looks like you pass ownership to readline and it frees them. If so, that means you are going to have to use the same allocator that readline does so that it can free the strings for you. You will likely have to write some custom code that copies a CString's data using the appropriate allocator.
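Such custom code could look something like the following sketch. It assumes that readline releases the strings with the system free(), and the helper name to_malloc_c_string is invented for the example:

```rust
use std::os::raw::{c_char, c_void};
use std::ptr;

// Assumption: the consumer of the string frees it with the system free(),
// so the buffer is obtained from the matching malloc.
unsafe extern "C" {
    fn malloc(size: usize) -> *mut c_void;
    fn free(p: *mut c_void);
}

/// Hypothetical helper: copy `s` into a malloc-allocated, NUL-terminated
/// buffer. Returns null if `s` contains a NUL byte or allocation fails.
pub fn to_malloc_c_string(s: &str) -> *mut c_char {
    let bytes = s.as_bytes();
    if bytes.contains(&0) {
        return ptr::null_mut();
    }
    unsafe {
        let p = malloc(bytes.len() + 1) as *mut u8;
        if p.is_null() {
            return ptr::null_mut();
        }
        ptr::copy_nonoverlapping(bytes.as_ptr(), p, bytes.len());
        *p.add(bytes.len()) = 0; // terminating NUL
        p as *mut c_char
    }
}
```

Because the returned pointer owns its own malloc allocation, it stays valid after every Rust value involved has been dropped, which also sidesteps the use-after-free problems described above.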
Style-wise, you can use underscores in variable names to indicate they are unused:
extern fn complete(_text: *const c_char, _start: c_int, _end: c_int)
There should be a space after : and there's no need to specify the type of the vector's contents:
let mut words: Vec<_>

static struct with C strings for lv2 plugin [duplicate]

This question already has answers here:
Creating a static C struct containing strings
(3 answers)
Closed 7 years ago.
I'm trying to learn Rust (newbie in low level programming), and want to translate a tiny lv2 amplifier (audio) plugin "amp.c" (C-code) from C to Rust. I actually got it working (here), but when the host terminates, valgrind says that "64 bytes in 1 blocks are definitely lost". I think I know why this happens, but I don't know how to fix it.
Before you get tired of reading, here is the final question:
How do I statically allocate a struct that contains a C string?
And here is the introduction:
Why it happens (I think):
Host loads the library and calls lv2_descriptor()
const LV2_Descriptor*
lv2_descriptor()
{
return &descriptor;
}
which returns a pointer to a STATICALLY allocated struct of type LV2_Descriptor,
static const LV2_Descriptor descriptor = {
AMP_URI,
...
};
which is defined as
typedef struct _LV2_Descriptor {
const char * URI;
...
} LV2_Descriptor;
Why is it statically allocated? In the amp.c it says:
It is best to define descriptors statically to avoid leaking memory
and non-portable shared library constructors and destructors to clean
up properly.
However, I translated lv2_descriptor() to Rust as:
#[no_mangle]
pub extern fn lv2_descriptor(index:i32) -> *const LV2Descriptor {
let s = "http://example.org/eg-amp_rust";
let cstr = CString::new(s).unwrap();
let ptr = cstr.as_ptr();
mem::forget(cstr);
let mybox = Box::new(LV2Descriptor{amp_uri: ptr}, ...);
let bxptr = &*mybox as *const LV2Descriptor;
mem::forget(mybox);
return bxptr
}
So it's not statically allocated and I never free it, which I guess is why valgrind complains?
How am I trying to solve it?
I'm trying to do the same thing in Rust as the C code does, i.e. statically allocate the struct (outside of lv2_descriptor()). The goal is to be fully compatible with the lv2 library, i.e. "...to avoid leaking memory..." etc., as it says in the quote, right? So I tried something like:
static ptr1: *const u8 = (b"example.org/eg-amp_rust\n\0").as_ptr();
static ptr2: *const libc::c_char = ptr1 as *const libc::c_char;
static desc: LV2Descriptor = LV2Descriptor{amp_uri: ptr2, ...};
But this does not compile, there are error messages like
src/lib.rs:184:26: 184:72 error: the trait `core::marker::Sync` is not implemented for the type `*const u8` [E0277]
src/lib.rs:184 static ptr1: *const u8 = b"http://example.org/eg-amp_rust\n\0".as_ptr();
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
src/lib.rs:184:26: 184:72 note: `*const u8` cannot be shared between threads safely
src/lib.rs:184 static ptr1: *const u8 = b"http://example.org/eg-amp_rust\n\0".as_ptr();
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
src/lib.rs:184:26: 184:72 error: static contains unimplemented expression type [E0019]
src/lib.rs:184 static ptr1: *const u8 = b"http://example.org/eg-amp_rust\n\0".as_ptr();
Specific problem/question:
How do I statically allocate a struct that contains a C string?
The short answer is, you don't for now. Future Rust will probably gain this ability.
What you can do, is statically allocate a struct that contains null pointers, and set those null pointers to something useful when you call the function. Rust has static mut. It requires unsafe code, is not threadsafe at all and is (to the best of my knowledge) considered a code smell.
Right here I consider it a workaround to the fact that there is no way to turn a &[T] into a *const T in a static.
static S: &'static [u8] = b"http://example.org/eg-amp_rust\n\0";
static mut desc: LV2Descriptor = LV2Descriptor {
amp_uri: 0 as *const libc::c_char, // ptr::null() isn't const fn (yet)
};
#[no_mangle]
pub extern fn lv2_descriptor(index: i32) -> *const LV2Descriptor {
let ptr = S.as_ptr() as *const libc::c_char;
unsafe {
desc.amp_uri = ptr;
&desc as *const LV2Descriptor
}
}
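On compilers newer than the one this answer was written for, the static mut workaround may no longer be necessary, since taking a pointer to a byte string literal is allowed in a static initializer. The sketch below rests on a few assumptions: LV2Descriptor is reduced to a single field, the index parameter keeps the i32 type used in the question, the "\n" is left out of the URI, and the unsafe impl Sync is a promise by the author that the pointed-to data is immutable and lives for the whole program:

```rust
use std::os::raw::c_char;

/// Reduced stand-in for the real descriptor struct (one field only).
#[repr(C)]
pub struct LV2Descriptor {
    pub amp_uri: *const c_char,
}

// Raw pointers are not Sync, which is what the compiler error in the
// question is about. This impl asserts that sharing is fine because the
// pointer targets immutable data with 'static lifetime.
unsafe impl Sync for LV2Descriptor {}

// The byte string literal lives in static memory; as_ptr() is usable in
// a static initializer on recent compilers.
static DESC: LV2Descriptor = LV2Descriptor {
    amp_uri: b"http://example.org/eg-amp_rust\0".as_ptr() as *const c_char,
};

/// Returns the address of the statically allocated descriptor, like the
/// C version does; nothing is allocated, so nothing can leak.
pub extern "C" fn lv2_descriptor(_index: i32) -> *const LV2Descriptor {
    &DESC
}
```

For use as an actual plugin entry point the function would still need to be exported under its unmangled name, as in the question.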
