How can I pass a pointer to itself to C++ so it can perform a callback? - rust

I've defined a way to pass my Rust object to C++ so it can call this Rust object back:
extern "C" {
pub fn zlmedia_set_parent(zl_media: *mut ZLInstance, parent: Box<ZLMedia>);
}
pub fn new_boxed(url: &str) -> Box<ZLMedia> {
let c_url = CString::new(url).expect("CString::new failed");
let p = Box::new(ZLMedia {
zl_media: unsafe { zlmedia_new(c_url.into_raw()) },
url: url.to_string(),
});
unsafe {
zlmedia_set_parent(p.as_ref().zl_media, p);
}
p
}
The problem is that I can't pass the Box to C++ because I move it here:
unsafe {
zlmedia_set_parent(p.as_ref().zl_media, p);
}
How can I pass a Box to itself to C++ and still return it in the new? I don't really need to move it, because it goes to a C++ function.

If you want the ZLMedia to be owned by the Rust code then you can pass a pointer to it rather then the box itself.
use std::borrow::Borrow as _;
extern "C" {
pub fn zlmedia_set_parent(zl_media: *mut ZLInstance, parent: *const ZLMedia);
}
pub fn new_boxed(url: &str) -> Box<ZLMedia> {
let c_url = CString::new(url).expect("CString::new failed");
let p = Box::new(ZLMedia {
zl_media: unsafe { zlmedia_new(c_url.into_raw()) },
url: url.to_string(),
});
unsafe {
zlmedia_set_parent(p.as_ref().zl_media, p.borrow() as *const ZLMedia);
}
p
}
If the C++ code dereferences the pointer after the box is dropped, it will trigger Undefined Behaviour. If the C++ code tries to free the pointer then it's also UB. It's up to you to make sure that these things don't happen.

Related

Is there a way to pass an extra value to the panic handler?

I'm writing a UEFI OS loader, and I use the system table provided by efi_main in the panic handler to print a string on the console. Currently, I'm using a global static variable and a helper function to access it like this:
static SYSTEM_TABLE_WRAPPER: Lazy<Spinlock<Option<SystemTable>>> =
Lazy::new(|| Spinlock::new(None));
#[panic_handler]
fn panic(i: &PanicInfo<'_>) -> ! {
// SAFETY: The existing lock is forgotten. There is no way to access the lock from the panic
// handler.
unsafe { unlock_system_table() }
error!("{}", i);
loop {
x86_64::instructions::hlt();
}
}
pub fn _print(args: fmt::Arguments<'_>) {
let mut st = crate::system_table();
let mut stdout = st.con_out();
let _ = stdout.write_fmt(args);
}
#[macro_export]
macro_rules! println {
() => {
$crate::print!("\n");
};
($($arg:tt)*)=>{
$crate::print!("{}\n",core::format_args!($($arg)*));
}
}
#[macro_export]
macro_rules! print {
($($arg:tt)*) => {
$crate::io::_print(core::format_args!($($arg)*));
};
}
pub(crate) fn system_table<'a>() -> MappedSpinlockGuard<'a, uefi_wrapper::SystemTable> {
let st = SYSTEM_TABLE_WRAPPER.try_lock();
let st = st.expect("Failed to lock the global System Table.");
SpinlockGuard::map(st, |st| {
let st = st.as_mut();
let st = st.expect("The global System Table is not initialized.");
&mut st.0
})
}
Although this works correctly, I'd like to avoid using any global variables if possible. Is there a way to do that?
I dont think so. If its possible, any parameter would be a global as well. Making it more complex.
Global variables are ok for this. Create your own global panic object and give it to a new panic handler from the real one.
Because there seems to be no way to do that, I changed the way.
Firstly, I defined the uefi_print and uefi_println macros.
#[macro_export]
macro_rules! uefi_print{
($st:expr,$($arg:tt)*)=>{
$crate::io::_print($st,format_args!($($arg)*));
}
}
#[macro_export]
macro_rules! uefi_println {
($st:expr) => {
$crate::uefi_print!($st,"\n");
};
($st:expr,$($arg:tt)*)=>{
$crate::uefi_print!($st,"{}\n",format_args!($($arg)*));
}
}
#[doc(hidden)]
pub fn _print(st: &mut crate::SystemTable, args: fmt::Arguments<'_>) {
let mut con_out = st.con_out();
let _ = con_out.write_fmt(args);
}
These macros are similar to print and println macros, but the first parameter is the mutable reference to SystemTable. The example macro invocation:
#[no_mangle]
pub extern "win64" fn efi_main(h: uefi_wrapper::Handle, mut st: bootx64::SystemTable) -> ! {
let resolution = gop::set_preferred_resolution(&mut st);
uefi_println!(&mut st, "GOP info: {:?}", resolution,);
// ...
}
Secondly, I defined the uefi_panic macro:
#[macro_export]
macro_rules! uefi_panic {
($st:expr) => {
$crate::uefi_panic!($st, "explicit panic");
};
($st:expr,$($t:tt)+)=>{
$crate::uefi_println!($st,"panicked at '{}', {}:{}:{}",core::format_args!($($t)+),file!(),line!(),column!());
panic!();
}
}
#[panic_handler]
fn panic(_: &PanicInfo<'_>) -> ! {
loop {
x86_64::instructions::hlt();
}
}
The first parameter of the macro is also the mutable reference to SystemTable.
Thirdly, I defined a wrapper type for SystemTable that defines the additional method expect_ok.
#[repr(transparent)]
#[derive(Debug)]
pub struct SystemTable(uefi_wrapper::SystemTable);
impl SystemTable {
// ...
/// # Panics
///
/// This method panics if `result` is [`Err`].
pub fn expect_ok<T, E: fmt::Debug>(&mut self, result: Result<T, E>, msg: &str) -> T {
match result {
Ok(val) => val,
Err(e) => {
uefi_panic!(self, "{}: {:?}", msg, e);
}
}
}
}
Now I prefer this way to the previous method. The exit boot services function can move the ownership of SystemTable and ImageHandle. This way prevents these types from being used after calling it. After all, I should not create a global static SystemTable since it won't be valid after exiting the boot services.

If an ffi function modifies a pointer, should the owning struct be referenced mutable?

I am currently experimenting with the FFI functionality of Rust and implemented a simble HTTP request using libcurl as an exercise. Consider the following self-contained example:
use std::ffi::c_void;
#[repr(C)]
struct CURL {
_private: [u8; 0],
}
// Global CURL codes
const CURL_GLOBAL_DEFAULT: i64 = 3;
const CURLOPT_WRITEDATA: i64 = 10001;
const CURLOPT_URL: i64 = 10002;
const CURLOPT_WRITEFUNCTION: i64 = 20011;
// Curl types
type CURLcode = i64;
type CURLoption = i64;
// Curl function bindings
#[link(name = "curl")]
extern "C" {
fn curl_easy_init() -> *mut CURL;
fn curl_easy_setopt(handle: *mut CURL, option: CURLoption, value: *mut c_void) -> CURLcode;
fn curl_easy_perform(handle: *mut CURL) -> CURLcode;
fn curl_global_init(flags: i64) -> CURLcode;
}
// Curl callback for data retrieving
extern "C" fn callback_writefunction(
data: *mut u8,
size: usize,
nmemb: usize,
user_data: *mut c_void,
) -> usize {
let slice = unsafe { std::slice::from_raw_parts(data, size * nmemb) };
let mut vec = unsafe { Box::from_raw(user_data as *mut Vec<u8>) };
vec.extend_from_slice(slice);
Box::into_raw(vec);
nmemb * size
}
type Result<T> = std::result::Result<T, CURLcode>;
// Our own curl handle
pub struct Curl {
handle: *mut CURL,
data_ptr: *mut Vec<u8>,
}
impl Curl {
pub fn new() -> std::result::Result<Curl, CURLcode> {
let ret = unsafe { curl_global_init(CURL_GLOBAL_DEFAULT) };
if ret != 0 {
return Err(ret);
}
let handle = unsafe { curl_easy_init() };
if handle.is_null() {
return Err(2); // CURLE_FAILED_INIT according to libcurl-errors(3)
}
// Set data callback
let ret = unsafe {
curl_easy_setopt(
handle,
CURLOPT_WRITEFUNCTION,
callback_writefunction as *mut c_void,
)
};
if ret != 0 {
return Err(2);
}
// Set data pointer
let data_buf = Box::new(Vec::new());
let data_ptr = Box::into_raw(data_buf);
let ret = unsafe {
curl_easy_setopt(handle, CURLOPT_WRITEDATA, data_ptr as *mut std::ffi::c_void)
};
match ret {
0 => Ok(Curl { handle, data_ptr }),
_ => Err(2),
}
}
pub fn set_url(&self, url: &str) -> Result<()> {
let url_cstr = std::ffi::CString::new(url.as_bytes()).unwrap();
let ret = unsafe {
curl_easy_setopt(
self.handle,
CURLOPT_URL,
url_cstr.as_ptr() as *mut std::ffi::c_void,
)
};
match ret {
0 => Ok(()),
x => Err(x),
}
}
pub fn perform(&self) -> Result<String> {
let ret = unsafe { curl_easy_perform(self.handle) };
if ret == 0 {
let b = unsafe { Box::from_raw(self.data_ptr) };
let data = (*b).clone();
Box::into_raw(b);
Ok(String::from_utf8(data).unwrap())
} else {
Err(ret)
}
}
}
fn main() -> Result<()> {
let my_curl = Curl::new().unwrap();
my_curl.set_url("https://www.example.com")?;
my_curl.perform().and_then(|data| Ok(println!("{}", data)))
// No cleanup code in this example for the sake of brevity.
}
While this works, I found it surprising that my_curl does not need to be declared mut, since none of the methods use &mut self, even though they pass a mut* pointer to the FFI function
s.
Should I change the declaration of perform to use &mut self instead of &self (for safety), since the internal buffer gets modified? Rust does not enforce this, but of course Rust does not know that the buffer gets modified by libcurl.
This small example runs fine, but I am unsure if I would be facing any kind of issues in larger programs, when the compiler might optimize for non-mutable access on the Curl struct, even though the instance of the struct is getting modified - or at least the data the pointer is pointing to.
Contrary to popular belief, there is absolutely no borrowchecker-induced restriction in Rust on passing *const/*mut pointers. There doesn't need to be, because dereferencing pointers is inherently unsafe, and can only be done in such blocks, with the programmer verifying all necessary invariants manually. In your case, you need to tell the compiler that is a mutable reference, as you already suspected.
The interested reader should definitely give the ffi section of the nomicon a read, to find out about some interesting ways to shoot yourself in the foot with it.

How to return a reference to a global vector or an internal Option?

I'm trying to create a method that can return a reference to Data that is either in a constant global array or inside an Option in an item. The lifetimes are certainly different, but it's safe to assume that the lifetime of the data is at least as long as the lifetime of the item. While doing this, I expected the compiler to warn if I did anything wrong, but it's instead generating wrong instructions and the program is crashing with SIGILL.
Concretely speaking, I have the following code failing in Rust 1.27.2:
#[derive(Debug)]
pub enum Type {
TYPE1,
TYPE2,
}
#[derive(Debug)]
pub struct Data {
pub ctype: Type,
pub int: i32,
}
#[derive(Debug)]
pub struct Entity {
pub idata: usize,
pub modifier: Option<Data>,
}
impl Entity {
pub fn data(&self) -> &Data {
if self.modifier.is_none() {
&DATA[self.idata]
} else {
self.modifier.as_ref().unwrap()
}
}
}
pub const DATA: [Data; 1] = [Data {
ctype: Type::TYPE2,
int: 1,
}];
fn main() {
let mut itemvec = vec![Entity {
idata: 0,
modifier: None,
}];
eprintln!("vec[0]: {:p} = {:?}", &itemvec[0], itemvec[0]);
eprintln!("removed item 0");
let item = itemvec.remove(0);
eprintln!("item: {:p} = {:?}", &item, item);
eprintln!("modifier: {:p} = {:?}", &item.modifier, item.modifier);
eprintln!("DATA: {:p} = {:?}", &DATA[0], DATA[0]);
let itemdata = item.data();
eprintln!("itemdata: {:p} = {:?}", itemdata, itemdata);
}
Complete code
I can't understand what I'm doing wrong. Why isn't the compiler generating a warning? Is it the removal of the (non-copy) item of the vector? Is it the ambiguous lifetimes?
How to return a reference to a global vector or an internal Option?
By using Option::unwrap_or_else:
impl Entity {
pub fn data(&self) -> &Data {
self.modifier.as_ref().unwrap_or_else(|| &DATA[self.idata])
}
}
but it's instead generating wrong instructions and the program is crashing with SIGILL
The code in your question does not have this behavior on macOS with Rust 1.27.2 or 1.28.0. On Ubuntu I see an issue when running the program in Valgrind, but the problem goes away in Rust 1.28.0.
See also:
Why should I prefer `Option::ok_or_else` instead of `Option::ok_or`?
What is this unwrap thing: sometimes it's unwrap sometimes it's unwrap_or

Rust "if let" not working?

I am wrapping libxml2 in Rust as an exercise in learning the Rust FFI, and I have come across something strange. I am new to Rust, but I believe the following should work.
In main.rs I have:
mod xml;
fn main() {
if let doc = xml::parse_file("filename") {
doc.simple_function();
}
}
And xml.rs is:
extern create libc;
use libc::{c_void, c_char, c_int, c_ushort};
use std::ffi::CString;
// There are other struct definitions I've implemented that xmlDoc
// depends on, but I'm not going to list them because I believe
// they're irrelevant
#[allow(non_snake_case)]
#[allow(non_camel_case_types)]
#[repr(C)]
struct xmlDoc {
// xmlDoc structure defined by the libxml2 API
}
pub struct Doc {
ptr: *mut xmlDoc
}
impl Doc {
pub fn simple_function(&self) {
if self.ptr.is_null() {
println!("ptr doesn't point to anything");
} else {
println!("ptr is not null");
}
}
}
#[allow(non_snake_case)]
#[link(name = "xml2")]
extern {
fn xmlParseFile(filename: *const c_char) -> *mut xmlDoc;
}
pub fn parse_file(filename: &str) -> Option<Doc> {
unsafe {
let result;
match CString::new(filename) {
Ok(f) => { result = xmlParseFile(f.as_ptr()); },
Err(_) => { return None; }
}
if result.is_null() {
return None;
}
Some(Doc { ptr: result })
}
}
I'm wrapping the C struct, xmlDoc in a nice Rust-friendly struct, Doc, to have a clear delineation between the safe (Rust) and unsafe (C) data types and functions.
This all works for the most part except when I compile, I get an error in main.rs:
src/main.rs:38:13: 38:28 error: no method named 'simple_function' found
for type 'std::option::Option<xml::Doc>' in the current scope
src/main.rs:38 doc.simple_function();
^~~~~~~~~~~~~~~
error: aborting due to previous error`
It seems convinced that doc is an Option<xml::Doc> even though I'm using the if let form that should unwrap the Option type. Is there something I'm doing incorrectly?
match xml::parse_file("filename") {
Some(doc) => doc.simple_function(),
None => {}
}
The above works fine, but I'd like to use the if let feature of Rust if I'm able.
You need to pass the actual pattern to if let (unlike languages like Swift which special case if let for Option types):
if let Some(doc) = xml::parse_file("filename") {
doc.simple_function();
}

Threaded calling of functions in a vector

I have an EventRegistry which people can use to register event listeners. It then calls the appropriate listeners when an event is broadcast. But, when I try to multithread it, it doesn't compile. How would I get this code working?
use std::collections::HashMap;
use std::thread;
struct EventRegistry<'a> {
event_listeners: HashMap<&'a str, Vec<Box<Fn() + Sync>>>
}
impl<'a> EventRegistry<'a> {
fn new() -> EventRegistry<'a> {
EventRegistry {
event_listeners: HashMap::new()
}
}
fn add_event_listener(&mut self, event: &'a str, listener: Box<Fn() + Sync>) {
match self.event_listeners.get_mut(event) {
Some(listeners) => {
listeners.push(listener);
return
},
None => {}
};
let mut listeners = Vec::with_capacity(1);
listeners.push(listener);
self.event_listeners.insert(event, listeners);
}
fn broadcast_event(&mut self, event: &str) {
match self.event_listeners.get(event) {
Some(listeners) => {
for listener in listeners.iter() {
let _ = thread::spawn(|| {
listener();
});
}
}
None => {}
}
}
}
fn main() {
let mut main_registry = EventRegistry::new();
main_registry.add_event_listener("player_move", Box::new(|| {
println!("Hey, look, the player moved!");
}));
main_registry.broadcast_event("player_move");
}
Playpen (not sure if it's minimal, but it produces the error)
If I use thread::scoped, it works too, but that's unstable, and I think it only works because it immediately joins back to the main thread.
Updated question
I meant "call them in their own thread"
The easiest thing to do is avoid the Fn* traits, if possible. If you know that you are only using full functions, then it's straightforward:
use std::thread;
fn a() { println!("a"); }
fn b() { println!("b"); }
fn main() {
let fns = vec![a as fn(), b as fn()];
for &f in &fns {
thread::spawn(move || f());
}
thread::sleep_ms(500);
}
If you can't use that for some reason (like you want to accept closures), then you will need to be a bit more explicit and use Arc:
use std::thread;
use std::sync::Arc;
fn a() { println!("a"); }
fn b() { println!("b"); }
fn main() {
let fns = vec![
Arc::new(Box::new(a) as Box<Fn() + Send + Sync>),
Arc::new(Box::new(b) as Box<Fn() + Send + Sync>),
];
for f in &fns {
let my_f = f.clone();
thread::spawn(move || my_f());
}
thread::sleep_ms(500);
}
Here, we can create a reference-counted trait object. We can clone the trait object (increasing the reference count) each time we spawn a new thread. Each thread gets its own reference to the trait object.
If I use thread::scoped, it works too
thread::scoped is pretty awesome; it's really unfortunate that it needed to be marked unstable due to some complex interactions that weren't the best.
One of the benefits of a scoped thread is that the thread is guaranteed to end by a specific time: when the JoinGuard is dropped. That means that scoped threads are allowed to contain non-'static references, so long as those references last longer than the thread!
A spawned thread has no such guarantees about how long they live; these threads may live "forever". Any references they take must also live "forever", thus the 'static restriction.
This serves to explain your original problem. You have a vector with a non-'static lifetime, but you are handing references that point into that vector to the thread. If the vector were to be deallocated before the thread exited, you could attempt to access undefined memory, which leads to crashes in C or C++ programs. This is Rust helping you out!
Original question
Call functions in vector without consuming them
The answer is that you just call them:
fn a() { println!("a"); }
fn b() { println!("b"); }
fn main() {
let fns = vec![Box::new(a) as Box<Fn()>, Box::new(b) as Box<Fn()>];
fns[0]();
fns[1]();
fns[0]();
fns[1]();
}
Playpen

Resources