Wrapping a Rust struct in a C++ class

Wrapping a Rust struct in a C++ class - rust

I would like to wrap a Rust struct in a C++ class.
Rust:
#[repr(C)]
pub struct RustStruct {
num: i32,
// other members..
}
pub extern "C" fn update(rust_struct: *mut RustStruct) {
(*rust_struct).num = 1i32;
}
extern "C" {
void update(void*);
}
C++:
class Wrapper {
public:
Wrapper();
// ..
private:
void* rustStruct;
// ..
};
Wrapper::Wrapper() {
update(rustStruct); // crash
}
int main() {
std::cout << "Testing..";
}
I understand why this wouldn't work. My question is: how can I achieve what I'm basically trying to do (wrap a rust struct in a c++ class)?

There is a mix of multiple FFIs concepts in your answer, so first let me recommend that your read the Reference.
There are two ways to achieve what you wish, you can either:
use a POD struct (Plain Old Data), aka C-compatible struct
use an opaque pointer (void* in C)
Mixing them, as you did, does not make sense.
Which to pick?
Both solutions have advantages and disadvantages, it's basically an expressiveness versus performance trade-off.
On the one hand, opaque pointers are more expressive: they can point to any Rust type. However:
they require dynamic memory allocation
they require being manipulated by Rust functions (so always indirectly from C or C++)
On the other hand, POD struct do not require either of those, but they are limited to only a subset of types expressible in Rust.
How to use a POD?
This is the easiest, actually, so let's start with it!
In Rust:
#[repr(C)]
pub struct RustStruct {
num: i32,
// other members, also PODs!
}
In C++
struct RustStruct {
int32_t num;
// other members, also with Standard Layout
// http://en.cppreference.com/w/cpp/types/is_standard_layout
};
class Wrapper {
public:
private:
RustStruct rustStruct;
};
Note that I just got along with your question stricto censu here, you could actually merge the two in a single C++ class:
class RustStruct {
public:
private:
int32_t num;
// other members, also with Standard Layout
// http://en.cppreference.com/w/cpp/types/is_standard_layout
};
Just avoid virtual methods.
How to use an opaque pointer?
This gets trickier:
Only the Rust code may correctly create/copy/destruct the type
Beware of leaking...
So, we need to implement a lot of functions in Rust:
#![feature(box_raw, box_syntax)]
use std::boxed;
pub struct RustStruct {
num: i32,
// other members, anything goes
}
pub extern "C" fn createRustStruct() -> *mut RustStruct {
boxed::into_raw(box RustStruct::new())
}
pub extern "C" fn destroyRustStruct(o: *mut RustStruct) {
boxed::from_raw(o);
}
Alright... now on to C++:
struct RustStruct;
RustStruct* createRustStruct();
void destroyRustStruct(RustStruct*);
class Wrapper {
public:
Wrapper(): rustStruct(RustStructPtr(createRustStruct())) {}
private:
struct Deleter {
void operator()(RustStruct* rs) const {
destroyRustStruct(rs);
}
};
typedef std::unique_ptr<RustStruct, Deleter> RustStructPtr;
RustStructPtr rustStruct;
}; // class Wrapper
So, yes, a bit more involved, and Wrapper is not copyable either (copy has to be delegated to Rust too). Anyway, this should get you started!
Note: if you have a lot of opaque pointers to wrap, a templated C++ class taking the copy/destroy functions as template parameters could alleviate a lot of boiler plate.

Related

JS style static property for Rust struct

In javascript we can use the static keyword to define a static method or property for a class. Neither static methods nor static properties can be called on instances of the class. Instead, they're called on the class itself. So, for instance, we could count the number of instances of a certain class we've created:
class Player{
static playerCount = 0;
constructor(){
Player.playerCount ++;
}
}
Is their anything in rust that would roughly equivalent to this? Or possibly a library/macro that allows for something similar?
struct Player{}
impl Player{
static playerCount;
pub fn new()->Self{
playerCount ++
//increment playerCount
}
}

There isn't such a thing as a static field in a Rust struct. Rust's structs are closer to their C namesake than the classes you'd find in object-oriented languages.
The rough equivalent is probably a static mut variable, which requires the use of unsafe code. This is because a globally accessible, mutable value is going to be a source of undefined behavior (such as data races) and Rust wants all code to be free of UB unless you write the word unsafe.
For your use case, perhaps a PlayerManager struct which holds this state and is passed by &mut reference to the new function is a good idea.
struct PlayerManager {
count: usize,
}
impl PlayerManager {
pub fn add_player(&mut self) {
self.count += 1;
}
}
// then...
impl Player {
pub fn new(manager: &mut PlayerManager) -> Self {
manager.add_player();
// TODO: the rest of player construction
}
}

Heterogenous containers in Rust for a graph

I am a C++ programmer learning Rust, and one of my main use cases is a graph-based computation engine. In my graph I have store a homogeneous type, and then I derive from this with a more specific type e.g. in C++
class BaseNode {
public:
BaseNode(std::vector<std::shared_ptr<BaseNode>>& parents);
virtual ~BaseNode() = default;
virtual void update();
const std::vector<std::shared_ptr<BaseNode>>& parents() const;
...
};
template<typename T>
class TypedNode<T> : public BaseNode {
public:
const T& value() const { return value_; }
...
private:
T value_;
}
The idea is that the graph is traversed and update() is called on each node. The node knows what each of its parents "true type" is and so in its update() can do something like static_cast<TypedNode<DataBlob>>(parents()[0]).
How do I achieve something like this in Rust?
I thought about having a design like this:
trait BaseNode {
fn parents(&self) -> &Vec<dyn BaseNode>;
}
trait TypedNode<T>: BaseNode {
fn value(&self) -> &T;
}
But I read that I won't be able to cast the "trait object" from a BaseNode into a TypedNode<T>. (Or can I do it somehow using unsafe?). The other alternative I thought would be to have a struct that stores the data in Any and then to cast that, but does that incur some runtime cost?

If all node's parents have the same type then you can use that approach:
trait BaseNode {
type Parent: BaseNode;
fn parents(&self) -> &[Self::Parent];
}
trait TypedNode<P: BaseNode>: BaseNode<Parent = P> {
type ValueType;
fn value(&self) -> &Self::ValueType;
}
Rust playground
I'm not sure if I understand your question. Please let me know if it doesn't work for you.

How do I use cbindgen to return and free a Box<Vec<_>>?

I have a struct returned to C code from Rust. I have no idea if it's a good way to do things, but it does work for rebuilding the struct and freeing memory without leaks.
#[repr(C)]
pub struct s {
// ...
}
#[repr(C)]
#[allow(clippy::box_vec)]
pub struct s_arr {
arr: *const s,
n: i8,
vec: Box<Vec<s>>,
}
/// Frees memory that was returned to C code
pub unsafe extern "C" fn free_s_arr(a: *mut s_arr) {
Box::from_raw(s_arr);
}
/// Generates an array for the C code
pub unsafe extern "C" fn gen_s_arr() -> *mut s_arr {
let many_s: Vec<s> = Vec::new();
// ... logic here
Box::into_raw(Box::new(s_arr {
arr: many_s.as_mut_ptr(),
n: many_s.len() as i8,
vec: many_s,
}))
}
The C header is currently written by hand, but I wanted to try out cbindgen. The manual C definition for s_arr is:
struct s_arr {
struct s *arr;
int8_t n;
void *_;
};
cbindgen generates the following for s_arr:
typedef struct Box_Vec_s Box_Vec_s;
typedef struct s_arr {
const s *arr;
int8_t n;
Box_Vec_s vec;
} s_arr;
This doesn't work since struct Box_Vec_s is not defined. Ideally I would just want to override the cbindgen type generated for vec to make it void * since it requires no code changes and thus no additional testing, but I am open to other suggestions.
I have looked through the cbindgen documentation, though not the examples, and couldn't find anything.

Your question is a bit unclear, but I think that if I understood you right, you're confusing two things and being led down a dark alley as a result.
In C, a dynamically-sized array, as you probably know, is identified by two things:
Its starting position, as a pointer
Its length
Rust follows the same convention - a Vec<_>, below the hood, shares the same structure (well, almost. It has a capacity as well, but that's beside the point).
Passing the boxed vector on top of a pointer is not only overkill, but extremely unwise. FFI bindings may be smart, but they're not smart enough to deal with a boxed complex type most of the time.
To solve this, we're going to simplify your bindings. I've added a single element in struct S to show you how it works. I've also cleaned up your FFI boundary:
#[repr(C)]
#[no_mangle]
pub struct S {
foo: u8
}
#[repr(C)]
pub struct s_arr {
arr: *mut S,
n: usize,
cap: usize
}
// Retrieve the vector back
pub unsafe extern "C" fn recombine_s_arr(ptr: *mut S, n: usize, cap: usize) -> Vec<S> {
Vec::from_raw_parts(ptr, n, cap)
}
#[no_mangle]
pub unsafe extern "C" fn gen_s_arr() -> s_arr {
let mut many_s: Vec<S> = Vec::new();
let output = s_arr {
arr: many_s.as_mut_ptr(),
n: many_s.len(),
cap: many_s.capacity()
};
std::mem::forget(many_s);
output
}
With this, cbindgen returns the expected header definitions:
typedef struct {
uint8_t foo;
} so58311426S;
typedef struct {
so58311426S *arr;
uintptr_t n;
uintptr_t cap;
} so58311426s_arr;
so58311426s_arr gen_s_arr(void);
This allows us to call gen_s_arr() from either C or Rust and retrieve a struct that is usable across both parts of the FFI boundary (so58311426s_arr). This struct contains all we need to be able to modify our array of S (well, so58311426S according to cbindgen).
When passing through FFI, you need to make sure of a few simple things:
You cannot pass raw boxes or non-primitive types; you will almost universally need to convert down to a set of pointers or change your definitions to accomodate (as I have done here)
You most definitely do not pass raw vectors. At most, you pass a slice, as that is a primitive type (see the point above).
You make sure to std::mem::forget() whatever you do not want to deallocate, and make sure to remember to deallocate it or reform it somewhere else.
I will edit this question in an hour; I have a plane to get on to. Let me know if any of this needs clarifications and I'll get to it once I'm in the right country :-)

Recommended way to wrap C lib initialization/destruction routine

I am writing a wrapper/FFI for a C library that requires a global initialization call in the main thread as well as one for destruction.
Here is how I am currently handling it:
struct App;
impl App {
fn init() -> Self {
unsafe { ffi::InitializeMyCLib(); }
App
}
}
impl Drop for App {
fn drop(&mut self) {
unsafe { ffi::DestroyMyCLib(); }
}
}
which can be used like:
fn main() {
let _init_ = App::init();
// ...
}
This works fine, but it feels like a hack, tying these calls to the lifetime of an unnecessary struct. Having the destructor in a finally (Java) or at_exit (Ruby) block seems theoretically more appropriate.
Is there some more graceful way to do this in Rust?
EDIT
Would it be possible/safe to use this setup like so (using the lazy_static crate), instead of my second block above:
lazy_static! {
static ref APP: App = App::new();
}
Would this reference be guaranteed to be initialized before any other code and destroyed on exit? Is it bad practice to use lazy_static in a library?
This would also make it easier to facilitate access to the FFI through this one struct, since I wouldn't have to bother passing around the reference to the instantiated struct (called _init_ in my original example).
This would also make it safer in some ways, since I could make the App struct default constructor private.

I know of no way of enforcing that a method be called in the main thread beyond strongly-worded documentation. So, ignoring that requirement... :-)
Generally, I'd use std::sync::Once, which seems basically designed for this case:
A synchronization primitive which can be used to run a one-time global
initialization. Useful for one-time initialization for FFI or related
functionality. This type can only be constructed with the ONCE_INIT
value.
Note that there's no provision for any cleanup; many times you just have to leak whatever the library has done. Usually if a library has a dedicated cleanup path, it has also been structured to store all that initialized data in a type that is then passed into subsequent functions as some kind of context or environment. This would map nicely to Rust types.
Warning
Your current code is not as protective as you hope it is. Since your App is an empty struct, an end-user can construct it without calling your method:
let _init_ = App;
We will use a zero-sized argument to prevent this. See also What's the Rust idiom to define a field pointing to a C opaque pointer? for the proper way to construct opaque types for FFI.
Altogether, I'd use something like this:
use std::sync::Once;
mod ffi {
extern "C" {
pub fn InitializeMyCLib();
pub fn CoolMethod(arg: u8);
}
}
static C_LIB_INITIALIZED: Once = Once::new();
#[derive(Copy, Clone)]
struct TheLibrary(());
impl TheLibrary {
fn new() -> Self {
C_LIB_INITIALIZED.call_once(|| unsafe {
ffi::InitializeMyCLib();
});
TheLibrary(())
}
fn cool_method(&self, arg: u8) {
unsafe { ffi::CoolMethod(arg) }
}
}
fn main() {
let lib = TheLibrary::new();
lib.cool_method(42);
}

I did some digging around to see how other FFI libs handle this situation. Here is what I am currently using (similar to #Shepmaster's answer and based loosely on the initialization routine of curl-rust):
fn initialize() {
static INIT: Once = ONCE_INIT;
INIT.call_once(|| unsafe {
ffi::InitializeMyCLib();
assert_eq!(libc::atexit(cleanup), 0);
});
extern fn cleanup() {
unsafe { ffi::DestroyMyCLib(); }
}
}
I then call this function inside the public constructors for my public structs.

Move semantics in Rust

I'm wrapping a C library in Rust, and many of its functions take parameters by pointers to structs, which themselves often have pointers to other structs. In the interest of reducing overhead, I'd like to provide the ability to cache the results of marshaling the Rust data into the C structs.
Here's an example of how the C library might expect some parameters:
#[repr(C)]
struct Foo {
x: i32,
y: f32
}
#[repr(C)]
struct Bar {
p_foo: *const Foo,
z: bool
}
And how I'd imagine an owning, "cached" version would look:
struct Cached {
foo: Option<Foo>,
bar: Bar
}
The p_foo field of bar would be constructed to point to Some value within foo, or a null pointer if there was None.
The issue, here, of course, is that if a value of Cached was to be moved, a straight memcpy would be inappropriate and bar.p_foo would additionally need to be redirected. This would be easy to ensure in C++, with its definable move semantics, but does Rust offer a solution besides "don't set bar.p_foo until it's used"? While it would certainly work to do it that way, I don't imagine that these cached values will be moved more than (or even close to the frequency that) they are reused, and there is a bit of work involved to set up these pointers, especially if the nesting/chaining is deep/long. I'd also rather not Box the substructures up on the heap.
To clarify, here's what I can write in C++, which I would like to replicate in Rust:
struct Foo {
int x;
float y;
};
struct Bar {
Foo const*pFoo;
bool z;
};
// bear with me while I conjure up a Maybe for C++
class Cached {
public:
// would have appropriate copy constructor/assignment
Cached(Cached &&other) {
m_foo = other.m_foo;
m_bar = other.m_bar;
if(m_foo.isJust()) {
m_bar.pFoo = &m_foo.value();
} // else already nullptr
}
// similar move assignment
private:
Maybe<Foo> m_foo;
Bar m_bar;
};

The Rust-equivalent would be to not use raw pointers, as raw pointers are there for implementing our safe datastructures, not for implementing normal datastructures.
#[repr(C)]
struct Foo {
x: i32,
y: f32
}
#[repr(C)]
struct Bar {
p_foo: Option<Box<Foo>>,
z: bool
}
An Option<Box<T>> is guaranteed to be exactly equivalent (in bits in memory) to a *const T, as long as T is a type and not a trait. The only difference is that it's safe to use within Rust.
This way you don't even need a Cached struct anymore, but can directly pass around the Bar object.
I'd also rather not Box the substructures up on the heap.
Then I suggest you don't keep a Bar object around, and instead conjure it up whenever you need to pass one to C:
#[repr(C)]
struct Foo {
x: i32,
y: f32
}
#[repr(C)]
struct Bar<'a> {
p_foo: Option<&'a Foo>,
z: bool
}
struct Cached {
foo: Option<Foo>,
z: bool,
}
impl Cached {
fn bar<'a>(&'a self) -> Bar<'a> {
Bar {
p_foo: self.foo.as_ref(),
z: self.z,
}
}
}
there is a bit of work involved to set up these pointers, especially if the nesting/chaining is deep/long.
That sounds a lot like premature optimization. Don't optimize where you haven't benchmarked.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Wrapping a Rust struct in a C++ class - rust

Related

JS style static property for Rust struct

Heterogenous containers in Rust for a graph

How do I use cbindgen to return and free a Box<Vec<_>>?

Recommended way to wrap C lib initialization/destruction routine

Move semantics in Rust

Categories

Resources