This question already has an answer here:
How can I call a raw address from Rust?
(1 answer)
Closed 3 years ago.
Hello people of the internet,
I'm struggeling to invoke a function that is stored in a libc::c_void-Pointer. I can't tell Rust that the pointer is invokable and I can't figure out how to.
I want to translate this C++ Code
void * malloc(size_t size) {
static void *(*real_malloc)(size_t) = nullptr;
if (real_malloc == nullptr) {
real_malloc = reinterpret_cast<void *(*)(size_t)> (dlsym(RTLD_NEXT, "malloc"));
}
// do some logging stuff
void * ptr = real_malloc(size);
return ptr;
}
to Rust.
#[no_mangle]
pub extern fn malloc(bytes: usize) {
let c_string = "malloc\0".as_mut_ptr() as *mut i8; // char array for libc
let real_malloc: *mut libc::c_void = libc::dlsym(libc::RTLD_NEXT, c_string);
return real_malloc(bytes);
}
That's my progress so far after 1h of searching on the internet and trying. I'm new to Rust and not yet familiar with Rust/FFI / Rust with libc. I tried a lot with unsafe{}, casts with as but I always stuck at the following problem:
return real_malloc(bytes);
^^^^^^^^^^^^^^^^^^ expected (), found *-ptr
Q1: How can I call the function behind the void-Pointer stored in real_malloc?
Q2: Is my Rust-String to C-String conversion feasible this way?
I figured it out! Perhaps there is a better way but it works.
The trick is to "cast" the void-Pointer to c-function-Type with std::mem::transmute since it won't work with as
type LibCMallocT = fn(usize) -> *mut libc::c_void;
// C-Style string for symbol-name
let c_string = "malloc\0".as_ptr() as *mut i8; // char array for libc
// Void-Pointer to address of symbol
let real_malloc_addr: *mut libc::c_void = unsafe {libc::dlsym(libc::RTLD_NEXT, c_string)};
// transmute: "Reinterprets the bits of a value of one type as another type"
// Transform void-pointer-type to callable C-Function
let real_malloc: LibCMallocT = unsafe { std::mem::transmute(real_malloc_addr) }
When the shared object is build, one can verify that it works like this:
LD_PRELOAD=./target/debug/libmalloc_log_lib.so some-binary
My full code:
extern crate libc;
use std::io::Write;
const MSG: &str = "HELLO WORLD\n";
type LibCMallocT = fn(usize) -> *mut libc::c_void;
#[no_mangle] // then "malloc" is the symbol name so that ELF-Files can find it (if this lib is preloaded)
pub extern fn malloc(bytes: usize) -> *mut libc::c_void {
/// Disable logging aka return immediately the pointer from the real malloc (libc malloc)
static mut RETURN_IMMEDIATELY: bool = false;
// C-Style string for symbol-name
let c_string = "malloc\0".as_ptr() as *mut i8; // char array for libc
// Void-Pointer to address of symbol
let real_malloc_addr: *mut libc::c_void = unsafe {libc::dlsym(libc::RTLD_NEXT, c_string)};
// transmute: "Reinterprets the bits of a value of one type as another type"
// Transform void-pointer-type to callable C-Function
let real_malloc: LibCMallocT = unsafe { std::mem::transmute(real_malloc_addr) };
unsafe {
if !RETURN_IMMEDIATELY {
// let's do logging and other stuff that potentially
// needs malloc() itself
// This Variable prevent infinite loops because 'std::io::stdout().write_all'
// also uses malloc itself
// TODO: Do proper synchronisazion
// (lock whole method? thread_local variable?)
RETURN_IMMEDIATELY = true;
match std::io::stdout().write_all(MSG.as_bytes()) {
_ => ()
};
RETURN_IMMEDIATELY = false
}
}
(real_malloc)(bytes)
}
PS: Thanks to https://stackoverflow.com/a/46134764/2891595 (After I googled a lot more I found the trick with transmute!)
Related
I've been diving into FFI with Rust, and while there are some useful posts and documentation on passing in strings and sometimes getting a string back as the return value from a C function, I don't see any way to get an out parameter. To get my feet wet, I am trying to access a C library that exposes a function that is declared as:
unsigned char *some_func(
const unsigned char *key,
unsigned int klen,
const unsigned char *src,
unsigned char *dest,
unsigned int slen);
I think it should be declared roughly as:
use libc;
extern "C" {
fn some_func(
key: *const libc::c_uchar,
klen: libc::c_uint,
src: *const libc::c_uchar,
dest: *mut libc::c_uchar,
slen: libc::c_uint) -> *mut libc::c_uchar;
}
// wrapper
pub fn wrapped_func(key: &str, src: &str) -> String {
unsafe {
let key_len = key.len() as u32;
let key = std::ffi::CString::new(key).unwrap();
let src_len = src.len() as u32;
let src = std::ffi::CString::new(src).unwrap();
let key_ptr = key.as_ptr();
let src_ptr = src.as_ptr();
let mut dest: *mut std::ffi::CStr;
let result = PC1(key_ptr, key_len, src_ptr, &mut dest, src_len);
// ^^^^^^^^^ expected `u8`, found *-ptr
key_ptr = ptr::null();
src_ptr = ptr::null();
let result_string = if let Ok(s) = result.to_str() {
String::from(s)
} else {
String::new()
};
dest = ptr::null();
result_string
}
There are a few issues I'm having with this:
I don't know how to properly declare dest as my out string. What's the right way to get an output string? From Wrapping Unsafe C Libraries in Rust, it looks like I need to treat this as a CStr:
For cases where you’re getting a const char * from a C function and want to convert it to something Rust can use, you’ll need to ensure it’s not null, then convert it into a &CStr with an appropriate lifetime. If you can’t figure out how to express an appropriate lifetime, the safest thing is to immediately convert it into an owned String!
The C code - and therefore my extern function - declares the strings as unsigned char *, but when I try to invoke, I have a type mismatch between u8 and i8. Is it acceptable to just switch the extern to be c_char, or is this dangerous?
Is setting key_ptr = ptr::null() the correct way (and all that's needed) to release the memory safely?
I would like to return some strings to C via a Rust FFI call. I also would like to ensure they're cleaned up properly.
I'm creating the strings on the Rust side and turning them into an address of an array of strings.
use core::mem;
use std::ffi::CString;
#[no_mangle]
pub extern "C" fn drop_rust_memory(mem: *mut ::libc::c_void) {
unsafe {
let boxed = Box::from_raw(mem);
mem::drop(boxed);
}
}
#[no_mangle]
pub extern "C" fn string_xfer(strings_out: *mut *mut *mut ::libc::c_char) -> usize {
unsafe {
let s1 = CString::new("String 1").unwrap();
let s2 = CString::new("String 2").unwrap();
let s1 = s1.into_raw();
let s2 = s2.into_raw();
let strs = vec![s1, s2];
let len = strs.len();
let mut boxed_slice = strs.into_boxed_slice();
*strings_out = boxed_slice.as_mut_ptr() as *mut *mut ::libc::c_char;
mem::forget(boxed_slice);
len
}
}
On the C side, I call the Rust FFI function, print the strings and then attempt to delete them via another Rust FFI call.
extern size_t string_xfer(char ***out);
extern void drop_rust_memory(void* mem);
int main() {
char **receiver;
int len = string_xfer(&receiver);
for (int i = 0; i < len; i++) {
printf("<%s>\n", receiver[i]);
}
drop_rust_memory(receiver);
printf("# rust memory dropped\n");
for (int i = 0; i < len; i++) {
printf("<%s>\n", receiver[i]);
}
return 0;
}
This appears to work. For the second printing after the drop, I would expect to get a crash or some undefined behavior, but I get this
<String 1>
<String 2>
# rust memory dropped
<(null)>
<String 2>
which makes me less sure about the entire thing.
First you may want take a look at Catching panic! when Rust called from C FFI, without spawning threads. Because panic will invoke undefined behaviour in this case. So you better catch the panic or avoid have code that can panic.
Secondly, into_boxed_slice() is primary use when you don't need vector feature any more so "A contiguous growable array type". You could also use as_mut_ptr() and forget the vector. That a choice either you want to carry the capacity information into C so you can make the vector grow or you don't want. (I think vector is missing a into_raw() method but I'm sure you can code one (just an example) to avoid critical code repetition). You could also use Box::into_raw() followed with a cast to transform the slice to pointer:
use std::panic;
use std::ffi::CString;
pub unsafe extern "C" fn string_xfer(len: &mut libc::size_t) -> Option<*mut *mut libc::c_char> {
if let Ok(slice) = panic::catch_unwind(move || {
let s1 = CString::new("String 1").unwrap();
let s2 = CString::new("String 2").unwrap();
let strs = vec![s1.into_raw(), s2.into_raw()];
Box::into_raw(strs.into_boxed_slice())
}) {
*len = (*slice).len();
Some(slice as _)
} else {
None
}
}
Third, your drop_rust_memory() only drop a pointer, I think you are doing a total UB here. Rust memory allocation need the real size of the allocation (capacity). And you didn't give the size of your slice, you tell to Rust "free this pointer that contain a pointer to nothing (void so 0)" but that not the good capacity. You need to use from_raw_parts_mut(), your C code must give the size of the slice to the Rust code. Also, you need to properly free your CString you need to call from_raw() to do it (More information here):
use std::ffi::CString;
pub unsafe extern "C" fn drop_rust_memory(mem: *mut *mut libc::c_char, len: libc::size_t) {
let slice = Box::from_raw(std::slice::from_raw_parts_mut(mem, len));
for &x in slice.iter() {
CString::from_raw(x);
} // CString will free resource don't use mem/vec element after
}
To conclude, you should read more about undefined behaviour, it's not about "expect a crash" or "something" should happen. When your program trigger a UB, everything can happen, you go into a random zone, read more about UB on this amazing LLVM blog post
Note about C style prefer return the pointer and not the size because strings_out: *mut *mut *mut ::libc::c_char is a ugly thing so do pub extern fn string_xfer(size: &mut libc::size_t) -> *mut *mut libc::c_char. Also, How to check if function pointer passed from C is non-NULL
mexPrintf, just like printf, accepts a varargs list of arguments, but I don't know what the best way to wrap this is in Rust. There is a RFC for variadic generics, but what can we do today?
In this example, I want to print of the number of inputs and outputs, but the wrapped function just prints garbage. Any idea how to fix this?
#![allow(non_snake_case)]
#![allow(unused_variables)]
extern crate mex_sys;
use mex_sys::mxArray;
use std::ffi::CString;
use std::os::raw::c_int;
use std::os::raw::c_void;
type VarArgs = *mut c_void;
// attempt to wrap mex_sys::mexPrintf
fn mexPrintf(fmt: &str, args: VarArgs) {
let cs = CString::new(fmt).unwrap();
unsafe {
mex_sys::mexPrintf(cs.as_ptr(), args);
}
}
#[no_mangle]
pub extern "system" fn mexFunction(
nlhs: c_int,
plhs: *mut *mut mxArray,
nrhs: c_int,
prhs: *mut *mut mxArray,
) {
let hw = CString::new("hello world\n").unwrap();
unsafe {
mex_sys::mexPrintf(hw.as_ptr());
}
let inout = CString::new("%d inputs and %d outputs\n").unwrap();
unsafe {
mex_sys::mexPrintf(inout.as_ptr(), nrhs, nlhs);
}
mexPrintf("hello world wrapped\n", std::ptr::null_mut());
let n = Box::new(nrhs);
let p = Box::into_raw(n);
mexPrintf("inputs %d\n", p as VarArgs);
let mut v = vec![3];
mexPrintf("vec %d\n", v.as_mut_ptr() as VarArgs);
}
Contrary to popular belief, it is possible to call variadic / vararg functions that were defined in C. That doesn't mean that doing so is very easy, and it's definitely even easier to do something bad because there are even fewer types for the compiler to check your work with.
Here's an example of calling printf. I've hard-coded just about everything:
extern crate libc;
fn my_thing() {
unsafe {
libc::printf(b"Hello, %s (%d)\0".as_ptr() as *const i8, b"world\0".as_ptr(), 42i32);
}
}
fn main() {
my_thing()
}
Note that I have to very explicitly make sure my format string and arguments are all the right types and the strings are NUL-terminated.
Normally, you'll use tools like CString:
extern crate libc;
use std::ffi::CString;
fn my_thing(name: &str, number: i32) {
let fmt = CString::new("Hello, %s (%d)").expect("Invalid format string");
let name = CString::new(name).expect("Invalid name");
unsafe {
libc::printf(fmt.as_ptr(), name.as_ptr(), number);
}
}
fn main() {
my_thing("world", 42)
}
The Rust compiler test suite also has an example of calling a variadic function.
A word of warning specifically for printf-like functions: C compiler-writers realized that people screw up this particular type of variadic function call all the time. To help combat that, they've encoded special logic that parses the format string and attempts to check the argument types against the types the format string expect. The Rust compiler will not check your C-style format strings for you!
I had confused a "variable list of arguments" with a va_list. I'm going to avoid both if I can and in this situation, I'm just going to do the string formatting in Rust before passing it to interop. Here is what worked for me in this case:
#![allow(non_snake_case)]
#![allow(unused_variables)]
extern crate mex_sys;
use mex_sys::mxArray;
use std::ffi::CString;
use std::os::raw::c_int;
// attempt to wrap mex_sys::mexPrintf
fn mexPrintf(text: &str) {
let cs = CString::new(text).expect("Invalid text");
unsafe {
mex_sys::mexPrintf(cs.as_ptr());
}
}
#[no_mangle]
pub extern "C" fn mexFunction(
nlhs: c_int,
plhs: *mut *mut mxArray,
nrhs: c_int,
prhs: *mut *mut mxArray,
) {
mexPrintf(&format!("{} inputs and {} outputs\n", nrhs, nlhs));
}
I'm trying to write a readline custom completer (tab completion) in Rust. I think I have everything straight, but when I try it en vivo it heads off into the weeds and never comes back. Oddly, when I call it directly from main() I appear to get a valid pointer back. I never see a crash or panic in either case. Backtrace output is not consistent over runs (it's busy doing something). Perhaps one clue is that gdb indicates that the arguments passed to the completer are incorrect (although I'm not actually using them). Eg, after callback:
#2 0x00007f141f701272 in readlinetest::complete (text=0x7f141ff27d10 "", start=2704437, end=499122176) at src/main.rs:24
Or directly, breakpointing the call in main:
#0 readlinetest::complete (text=0x555555559190 <complete::hcda8d6cb2ef52a1bKaa> "dH;$%p", start=0, end=0) at src/main.rs:21
Do I have an ABI problem? Seems unlikely and the function signature isn't complicated :(
Here is a small test project: Cargo.toml:
[package]
name = "readlinetest"
version = "0.1.0"
authors = ["You <you#example.com>"]
[dependencies]
libc = "*"
readline = "*"
And main.rs:
extern crate libc;
extern crate readline;
use libc::{c_char, c_int};
use std::ffi::CString;
use std::process::exit;
use std::ptr;
use std::str;
extern { fn puts(s: *const libc::c_char); }
#[link(name = "readline")]
// Define the global in libreadline that will point to our completion function.
extern {
static mut rl_completion_entry_function: extern fn(text: *const c_char,
start: c_int,
end: c_int) -> *const *const c_char;
}
// Our completion function. Returns two strings.
extern fn complete(text: *const c_char, start: c_int, end: c_int) -> *const *const c_char {
let _ = text; let _ = start; let _ = end;
let mut words:Vec<*const c_char> =
vec!(CString::new("one").unwrap(), CString::new("two").unwrap()).
iter().
map(|arg| arg.as_ptr()).
collect();
words.push(ptr::null()); // append null
words.as_ptr() as *const *const c_char
}
fn main() {
let words = complete(string_to_mut_c_char("hi"), 1, 2);
unsafe { puts(*words) } // prints "one"
//unsafe { puts((*words + ?)) } // not sure hot to get to "two"
unsafe { rl_completion_entry_function = complete }
// Loop until EOF: echo input to stdout
loop {
if let Ok(input) = readline::readline_bare(&CString::new("> ").unwrap()) {
let text = str::from_utf8(&input.to_bytes()).unwrap();
println!("{}", text);
} else { // EOF/^D
exit(0)
}
}
}
// Just for testing
fn string_to_mut_c_char(s: &str) -> *mut c_char {
let mut bytes = s.to_string().into_bytes(); // Vec<u8>
bytes.push(0); // terminating null
let mut cchars = bytes.iter().map(|b| *b as c_char).collect::<Vec<c_char>>();
let name: *mut c_char = cchars.as_mut_ptr();
name
}
Ubuntu 14.04, 64 bit with Rust 1.3.
What am I missing? Thanks for any pointers (ha ha...).
and the function signature isn't complicated
It's not, but it does help to have the right one... ^_^ From my local version of readline (6.3.8):
extern rl_compentry_func_t *rl_completion_entry_function;
typedef char *rl_compentry_func_t PARAMS((const char *, int));
Additionally, you have multiple use after free errors:
vec![CString::new("one").unwrap()].iter().map(|s| s.as_ptr());
This creates a CString and gets the pointer to it. When the statement is done, nothing owns the vector that owns the strings. The vector will be immediately dropped, dropping the strings, invalidating the pointers.
words.as_ptr() as *const *const c_char
Similar thing here — you take the pointer, but then nothing owns the words vector anymore, so it is dropped, invalidating that pointer. So now you have an invalid pointer which attempts to point to a sequence of invalid pointers.
The same problem can be found in string_to_mut_c_char.
I don't know enough readline to understand who is supposed to own the returned strings, but it looks like you pass ownership to readline and it frees them. If so, that means you are going to have to use the same allocator that readline does so that it can free the strings for you. You will likely have to write some custom code that copies a CString's data using the appropriate allocator.
Style-wise, you can use underscores in variable names to indicate they are unused:
extern fn complete(_text: *const c_char, _start: c_int, _end: c_int)
There should be a space after : and there's no need to specify the type of the vector's contents:
let mut words: Vec<_>
This question already has answers here:
Creating a static C struct containing strings
(3 answers)
Closed 7 years ago.
I'm trying to learn Rust (newbie in low level programming), and want to translate a tiny lv2 amplifier (audio) plugin "amp.c" (C-code) from C to Rust. I actually got it working (here), but when the host terminates, valgrind says that "
64 bytes in 1 blocks are definitely lost". I think I know why this happens, but I don't know how to fix it.
Before you get tired of reading, here is the final question:
How do I statically allocate a struct that contains a C string?
And here is the introduction:
Why it happens (I think):
Host loads the library and calls lv2_descriptor()
const LV2_Descriptor*
lv2_descriptor()
{
return &descriptor;
}
which returns a pointer to a STATICALLY allocated struct of type LV2_Descriptor,
static const LV2_Descriptor descriptor = {
AMP_URI,
...
};
which is defined as
typedef struct _LV2_Descriptor {
const char * URI;
...
} LV2_Descriptor;
Why is it statically allocated? In the amp.c it says:
It is best to define descriptors statically to avoid leaking memory
and non-portable shared library constructors and destructors to clean
up properly.
However, I translated lv2_descriptor() to Rust as:
#[no_mangle]
pub extern fn lv2_descriptor(index:i32) -> *const LV2Descriptor {
let s = "http://example.org/eg-amp_rust";
let cstr = CString::new(s).unwrap();
let ptr = cstr.as_ptr();
mem::forget(cstr);
let mybox = Box::new(LV2Descriptor{amp_uri: ptr}, ...);
let bxptr = &*mybox as *const LV2Descriptor;
mem::forget(mybox);
return bxptr
}
So it's not statically allocated and I never free it, that's I guess why valgrind complains?
How am I trying to solve it?
I'm trying to do the same thing in Rust as the C-code does, i.e. statically allocate the struct (outside of lv2_descriptor()). The goal is to be fully compatible to the lv2 library, i.e "...to avoid leaking memory..." etc., as it says in the quote, right? So I tried something like:
static ptr1: *const u8 = (b"example.org/eg-amp_rust\n\0").as_ptr();
static ptr2: *const libc::c_char = ptr1 as *const libc::c_char;
static desc: LV2Descriptor = LV2Descriptor{amp_uri: ptr2, ...};
But this does not compile, there are error messages like
src/lib.rs:184:26: 184:72 error: the trait `core::marker::Sync` is not implemented for the type `*const u8` [E0277]
src/lib.rs:184 static ptr1: *const u8 = b"http://example.org/eg-amp_rust\n\0".as_ptr();
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
src/lib.rs:184:26: 184:72 note: `*const u8` cannot be shared between threads safely
src/lib.rs:184 static ptr1: *const u8 = b"http://example.org/eg-amp_rust\n\0".as_ptr();
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
src/lib.rs:184:26: 184:72 error: static contains unimplemented expression type [E0019]
src/lib.rs:184 static ptr1: *const u8 = b"http://example.org/eg-amp_rust\n\0".as_ptr();
Specific problem/question:
How do I statically allocate a struct that contains a C string?
The short answer is, you don't for now. Future Rust will probably gain this ability.
What you can do, is statically allocate a struct that contains null pointers, and set those null pointers to something useful when you call the function. Rust has static mut. It requires unsafe code, is not threadsafe at all and is (to the best of my knowledge) considered a code smell.
Right here I consider it a workaround to the fact that there is no way to turn a &[T] into a *const T in a static.
static S: &'static [u8] = b"http://example.org/eg-amp_rust\n\0";
static mut desc: LV2Descriptor = LV2Descriptor {
amp_uri: 0 as *const libc::c_char, // ptr::null() isn't const fn (yet)
};
#[no_mangle]
pub extern fn lv2_descriptor(index: i32) -> *const LV2Descriptor {
let ptr = S.as_ptr() as *const libc::c_char;
unsafe {
desc.amp_uri = ptr;
&desc as *const LV2Descriptor
}
}