I am writing an OS in Rust and need to directly call into a virtual address that I'm calculating (of type u32). I expected this to be relatively simple:
let code = virtual_address as (extern "C" fn ());
(code)();
However, this complains that the cast is non-primitive. It suggests I use the From trait, but I don't see how this could help (although I am relatively new to Rust and so could be missing something).
error[E0605]: non-primitive cast: `u32` as `extern "C" fn()`
--> src/main.rs:3:16
|
3 | let code = virtual_address as (extern "C" fn ());
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: an `as` expression can only be used to convert between primitive types. Consider using the `From` trait
I have everything in libcore at my disposal, but haven't ported std and so can't rely on anything that isn't no_std
Casts of the type _ as f-ptr are not allowed (see the Rustonomicon chapter on casts). So, as far as I can tell, the only way to cast to function pointer types is to use the all mighty weapon mem::transmute().
But before we can use transmute(), we have to bring our input into the right memory layout. We do this by casting to *const () (a void pointer). Afterwards we can use transmute() to get what we want:
let ptr = virtual_address as *const ();
let code: extern "C" fn() = unsafe { std::mem::transmute(ptr) };
(code)();
If you find yourself doing this frequently, various kinds of macros can remove the boilerplate. One possibility:
macro_rules! example {
($address:expr, $t:ty) => {
std::mem::transmute::<*const (), $t>($address as _)
};
}
let f = unsafe { example!(virtual_address, extern "C" fn()) };
f();
However, a few notes on this:
If you, future reader, want to use this to do simple FFI things: please take a moment to think about it again. Calculating function pointers yourself is rarely necessary.
Usually extern "C" functions have the type unsafe extern "C" fn(). This means that those functions are unsafe to call. You should probably add the unsafe to your function.
Related
Please consider the following minimal example in Rust:
const FOOBAR: usize = 3;
trait Foo {
const BAR: usize;
}
struct Fubar();
impl Foo for Fubar {
const BAR: usize = 3;
}
struct Baz<T>(T);
trait Qux {
fn print_bar();
}
impl<T: Foo> Qux for Baz<T> {
fn print_bar() {
println!("bar: {}", T::BAR); // works
println!("{:?}", [T::BAR; 3]); // works
println!("{:?}", [1; FOOBAR]); // works
println!("{:?}", [1; T::BAR]); // this gives an error
}
}
fn main() {
Baz::<Fubar>::print_bar();
}
The compiler gives the following error:
error[E0599]: no associated item named `BAR` found for type `T` in the current scope
--> src/main.rs:24:30
|
24 | println!("{:?}", [1; T::BAR]); // this gives an error
| ^^^^^^ associated item not found in `T`
|
= help: items from traits can only be used if the trait is implemented and in scope
= note: the following trait defines an item `BAR`, perhaps you need to implement it:
candidate #1: `Foo`
Whatever the answer to my question, this is not a particularly good error message because it suggests that T does implement Foo despite the latter being a trait bound. Only after burning a lot of time did it occur to me that in fact T::BAR is a perfectly valid expression in other contexts, just not as a length parameter to an array.
What are the rules that govern what kind of expressions can go there? Because arrays are Sized, I completely understand that the length are to be known at compile time. Coming from C++ myself, I would expect some restriction akin to constexpr but I have not come across that in the documentation where it just says
A fixed-size array, denoted [T; N], for the element type, T, and the non-negative compile-time constant size, N.
As of Rust 1.24.1, the array length basically needs to either be a numeric literal or a "regular" constant that is a usize. There's a small amount of constant evaluation that exists today, but it's more-or-less limited to basic math.
a perfectly valid expression in other contexts, just not as a length parameter to an array
Array lengths don't support generic parameters. (#43408)
this is not a particularly good error message
Error message should be improved for associated consts in array lengths (#44168)
I would expect some restriction akin to constexpr
This is essentially the restriction, the problem is that what is allowed to be used in a const is highly restricted at the moment. Notably, these aren't allowed:
functions (except to construct enums or structs)
loops
multiple statements / blocks
Work on good constant / compile-time evaluation is still ongoing. There are a large amount of RFCs, issues, and PRs improving this. A sample:
Const fn tracking issue (RFC 911)
Allow locals and destructuring in const fn (RFC 2341)
Allow if and match in constants (RFC 2342)
I have an Option<&T> and I would like to have a raw *const T which is null if the option was None. I want to wrap an FFI call that takes a pointer to a Rust-allocated object.
Additionally, the FFI interface I am using has borrowing semantics (I allocate something and pass in a pointer to it), not ownership semantics
extern "C" {
// Parameter may be null
fn ffi_call(*const T);
}
fn safe_wrapper(opt: Option<&T>) {
let ptr: *const T = ???;
unsafe { ffi_call(ptr) }
}
I could use a match statement to do this, but that method feels very verbose.
let ptr = match opt {
Some(inner) => inner as *const T,
None => null(),
};
I could also map the reference to a pointer, then use unwrap_or.
let ptr = opt.map(|inner| inner as *const T).unwrap_or(null());
However, I'm worried that the pointer might be invalidated as it passes through the closure. Does Rust make a guarantee that the final pointer will point to the same thing as the original reference? If T is Copy, does this change the semantics in a meaningful way? Is there a better way that I am overlooking?
Yes, this is safe. I'd write it as:
use std::ptr;
fn safe_wrapper(opt: Option<&u8>) {
let p = opt.map_or_else(ptr::null, |x| x);
unsafe { ffi_call(p) }
}
If you find yourself writing this a lot, you could make it into a trait and reduce it down to a single method call.
the pointer might be invalidated as it passes through the closure
It could be, if you invalidate it yourself somehow. Because the function takes a reference, you know for sure that the referred-to value will be valid for the duration of the function call — that's the purpose of Rust's borrow checker.
The only way for the pointer to become invalid is if you change the value of the pointer (e.g. you add an offset to it). Since you don't do that, it's fine.
Does Rust make a guarantee that the final pointer will point to the same thing as the original reference?
It depends what you mean by "final". Converting a reference to a pointer will always result in both values containing the same location in memory. Anything else would be deliberately malicious and no one would ever have used Rust to begin with.
If T is Copy, does this change the semantics in a meaningful way?
No. Besides we are talking about a &T, which is always Copy
See also:
Convert Option<&mut T> to *mut T
Should we use Option or ptr::null to represent a null pointer in Rust?
Is it valid to use ptr::NonNull in FFI?
the FFI interface I am using has borrowing semantics (I allocate something and pass in a pointer to it), not ownership semantics
To be clear, you cannot determine ownership based purely on what the function types are.
This C function takes ownership:
void string_free(char *)
This C function borrows:
size_t string_len(char *)
Both take a pointer. Rust improves on this situation by clearly delineating what is a borrow and what is a transfer of ownership.
extern "C" {
// Parameter may be null
fn ffi_call(*const T);
}
This code is nonsensical; it does not define the generic type T and FFI functions cannot have generic types anyway.
I have managed to make the Rust type checker go into an infinite loop. A very similar program compiles with no trouble. Why does the program I want not compile?
To save your time and effort, I have made minimal versions of the two programs that isolate the problem. Of course, the minimal version is a pointless program. You'll have to use your imagination to see my motivation.
Success
Let me start with the version that works. The struct F<T> wraps a T. The type Target can be converted from an F<T> provided T can.
struct F<T>(T);
impl<T> From<F<T>> for Target where Target: From<T> {
fn from(a: F<T>) -> Target {
let b = Target::from(a.0);
f(&b)
}
}
Here's an example caller:
fn main() {
let x = Target;
let y = F(F(F(x)));
let z = Target::from(y);
println!("{:?}", z);
}
This runs and prints "Target".
Failure
The function f does not consume its argument. I would prefer it if the From conversion also did not consume its argument, because the type F<T> could be expensive or impossible to clone. I can write a custom trait FromRef that differs from std::convert::From by accepting an immutable borrow instead of an owned value:
trait FromRef<T> {
fn from_ref(a: &T) -> Self;
}
Of course, I ultimately want to use From<&'a T>, but by defining my own trait I can ask my question more clearly, without messing around with lifetime parameters. (The behaviour of the type-checker is the same using From<&'a T>).
Here's my implementation:
impl<T> FromRef<F<T>> for Target where Target: FromRef<T> {
fn from_ref(a: &F<T>) -> Target {
let b = Target::from_ref(&a.0);
f(&b)
}
}
This compiles. However, the main() function doesn't:
fn main() {
let x = Target;
let y = F(F(F(x)));
let z = Target::from_ref(y);
println!("{:?}", z);
}
It gives a huge error message beginning:
error[E0275]: overflow evaluating the requirement `_: std::marker::Sized`
--> <anon>:26:13
|
26 | let z = Target::from_ref(y);
| ^^^^^^^^^^^^^^^^
|
= note: consider adding a `#![recursion_limit="128"]` attribute to your crate
= note: required because of the requirements on the impl of `FromRef<F<_>>` for `Target`
= note: required because of the requirements on the impl of `FromRef<F<F<_>>>` for `Target`
= note: required because of the requirements on the impl of `FromRef<F<F<F<_>>>>` for `Target`
etc...
What am I doing wrong?
Update
I've randomly fixed it!
The problem was that I forgot to implement FromRef<Target> for Target.
So I would now like to know: what was the compiler thinking? I still can't relate the problem to the error message.
You can't avoid consuming the input in the standard From/Into traits.
They are defined to always consume the input. Their definition specifies both input and output as owned types, with unrelated lifetimes, so you can't even "cheat" by trying to consume a reference.
If you're returning a reference, you can implement AsRef<T> instead. Or if your type is a thin wrapper/smart pointer, Deref<T>. You can provide methods as_foo()
If you're returning a new (owned) object, the convention is to provide to_foo() methods.
I want to return a vector in a pub extern "C" fn. Since a vector has an arbitrary length, I guess I need to return a struct with
the pointer to the vector, and
the number of elements in the vector
My current code is:
extern crate libc;
use self::libc::{size_t, int32_t, int64_t};
// struct to represent an array and its size
#[repr(C)]
pub struct array_and_size {
values: int64_t, // this is probably not how you denote a pointer, right?
size: int32_t,
}
// The vector I want to return the address of is already in a Boxed struct,
// which I have a pointer to, so I guess the vector is on the heap already.
// Dunno if this changes/simplifies anything?
#[no_mangle]
pub extern "C" fn rle_show_values(ptr: *mut Rle) -> array_and_size {
let rle = unsafe {
assert!(!ptr.is_null());
&mut *ptr
};
// this is the Vec<i32> I want to return
// the address and length of
let values = rle.values;
let length = values.len();
array_and_size {
values: Box::into_raw(Box::new(values)),
size: length as i32,
}
}
#[derive(Debug, PartialEq)]
pub struct Rle {
pub values: Vec<i32>,
}
The error I get is
$ cargo test
Compiling ranges v0.1.0 (file:///Users/users/havpryd/code/rust-ranges)
error[E0308]: mismatched types
--> src/rle.rs:52:17
|
52 | values: Box::into_raw(Box::new(values)),
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected i64, found *-ptr
|
= note: expected type `i64`
= note: found type `*mut std::vec::Vec<i32>`
error: aborting due to previous error
error: Could not compile `ranges`.
To learn more, run the command again with --verbose.
-> exit code: 101
I posted the whole thing because I could not find an example of returning arrays/vectors in the eminently useful Rust FFI Omnibus.
Is this the best way to return a vector of unknown size from Rust? How do I fix my remaining compile error? Thanks!
Bonus q: if the fact that my vector is in a struct changes the answer, perhaps you could also show how to do this if the vector was not in a Boxed struct already (which I think means the vector it owns is on the heap too)? I guess many people looking up this q will not have their vectors boxed already.
Bonus q2: I only return the vector to view its values (in Python), but I do not want to let the calling code change the vector. But I guess there is no way to make the memory read-only and ensure the calling code does not fudge with the vector? const is just for showing intent, right?
Ps: I do not know C or Rust well, so my attempt might be completely WTF.
pub struct array_and_size {
values: int64_t, // this is probably not how you denote a pointer, right?
size: int32_t,
}
First of all, you're correct. The type you want for values is *mut int32_t.
In general, and note that there are a variety of C coding styles, C often doesn't "like" returning ad-hoc sized array structs like this. The more common C API would be
int32_t rle_values_size(RLE *rle);
int32_t *rle_values(RLE *rle);
(Note: many internal programs do in fact use sized array structs, but this is by far the most common for user-facing libraries because it's automatically compatible with the most basic way of representing arrays in C).
In Rust, this would translate to:
extern "C" fn rle_values_size(rle: *mut RLE) -> int32_t
extern "C" fn rle_values(rle: *mut RLE) -> *mut int32_t
The size function is straightforward, to return the array, simply do
extern "C" fn rle_values(rle: *mut RLE) -> *mut int32_t {
unsafe { &mut (*rle).values[0] }
}
This gives a raw pointer to the first element of the Vec's underlying buffer, which is all C-style arrays really are.
If, instead of giving C a reference to your data you want to give C the data, the most common option would be to allow the user to pass in a buffer that you clone the data into:
extern "C" fn rle_values_buf(rle: *mut RLE, buf: *mut int32_t, len: int32_t) {
use std::{slice,ptr}
unsafe {
// Make sure we don't overrun our buffer's length
if len > (*rle).values.len() {
len = (*rle).values.len()
}
ptr::copy_nonoverlapping(&(*rle).values[0], buf, len as usize);
}
}
Which, from C, looks like
void rle_values_buf(RLE *rle, int32_t *buf, int32_t len);
This (shallowly) copies your data into the presumably C-allocated buffer, which the C user is then responsible for destroying. It also prevents multiple mutable copies of your array from floating around at the same time (assuming you don't implement the version that returns a pointer).
Note that you could sort of "move" the array into C as well, but it's not particularly recommended and involves the use mem::forget and expecting the C user to explicitly call a destruction function, as well as requiring both you and the user to obey some discipline that may be difficult to structure the program around.
If you want to receive an array from C, you essentially just ask for both a *mut i32 and i32 corresponding to the buffer start and length. You can assemble this into a slice using the from_raw_parts function, and then use the to_vec function to create an owned Vector containing the values allocated from the Rust side. If you don't plan on needing to own the values, you can simply pass around the slice you produced via from_raw_parts.
However, it is imperative that all values be initialized from either side, typically to zero. Otherwise you invoke legitimately undefined behavior which often results in segmentation faults (which tend to frustratingly disappear when inspected with GDB).
There are multiple ways to pass an array to C.
First of all, while C has the concept of fixed-size arrays (int a[5] has type int[5] and sizeof(a) will return 5 * sizeof(int)), it is not possible to directly pass an array to a function or return an array from it.
On the other hand, it is possible to wrap a fixed size array in a struct and return that struct.
Furthermore, when using an array, all elements must be initialized, otherwise a memcpy technically has undefined behavior (as it is reading from undefined values) and valgrind will definitely report the issue.
Using a dynamic array
A dynamic array is an array whose length is unknown at compile-time.
One may chose to return a dynamic array if no reasonable upper-bound is known, or this bound is deemed too large for passing by value.
There are two ways to handle this situation:
ask C to pass a suitably sized buffer
allocate a buffer and return it to C
They differ in who allocates the memory: the former is simpler, but may require to either have a way to hint at a suitable size or to be able to "rewind" if the size proves unsuitable.
Ask C to pass a suitable sized buffer
// file.h
int rust_func(int32_t* buffer, size_t buffer_length);
// file.rs
#[no_mangle]
pub extern fn rust_func(buffer: *mut libc::int32_t, buffer_length: libc::size_t) -> libc::c_int {
// your code here
}
Note the existence of std::slice::from_raw_parts_mut to transform this pointer + length into a mutable slice (do initialize it with 0s before making it a slice or ask the client to).
Allocate a buffer and return it to C
// file.h
struct DynArray {
int32_t* array;
size_t length;
}
DynArray rust_alloc();
void rust_free(DynArray);
// file.rs
#[repr(C)]
struct DynArray {
array: *mut libc::int32_t,
length: libc::size_t,
}
#[no_mangle]
pub extern fn rust_alloc() -> DynArray {
let mut v: Vec<i32> = vec!(...);
let result = DynArray {
array: v.as_mut_ptr(),
length: v.len() as _,
};
std::mem::forget(v);
result
}
#[no_mangle]
pub extern fn rust_free(array: DynArray) {
if !array.array.is_null() {
unsafe { Box::from_raw(array.array); }
}
}
Using a fixed-size array
Similarly, a struct containing a fixed size array can be used. Note that both in Rust and C all elements should be initialized, even if unused; zeroing them works well.
Similarly to the dynamic case, it can be either passed by mutable pointer or returned by value.
// file.h
struct FixedArray {
int32_t array[32];
};
// file.rs
#[repr(C)]
struct FixedArray {
array: [libc::int32_t; 32],
}
Running this code
extern crate debug;
fn main() {
let x = &5i;
println!("{:?}", x);
}
prints &5. How is this data type called? I would have expected to see something like a pointer to int.
rustc 0.12.0-pre-nightly (09abbbdaf 2014-09-11 00:05:41 +0000)
The type of &5i is &int, which is called a reference to int, or maybe a shared reference, or an immutable reference.
Incidentally, "{:?}" yields a representation of the value based on reflection, not a representation of the type. The only way I'm aware of to get the name of a type is to involve it in a type error.
fn main() {
let i = &5i;
let () = i;
}
but even then you only get it to mention &int and &-ptr and not a politically correct, up-to-date in common community parlance, prose description.