Are raw pointers to temporaries ok in Rust? - rust

I have a function like this:
extern {
fn foo(layout: *const RawLayout) -> libc::uint8_t;
}
fn bar(layout: Layout) -> bool {
unsafe {
foo(&layout.into() as *const _) != 0
}
}
Where Layout is a copyable type that can be converted .into() a RawLayout.
I want to make sure I understand what is happening as it is unsafe. As I understand it, layout.into() creates a temporary RawLayout, then & takes a reference to it, and as *const _ converts it to a raw pointer (*const RawLayout). Then the foo() function is called and returns, and finally the temporary RawLayout is dropped.
Is that correct? Or is there some tricky reason why I shouldn't do this?

You are right. In this case, foo is called first and RawLayout is dropped afterwards. This is explained in The Rust Reference (follow the link to see concrete examples of how this works out in practice):
The lifetime of temporary values is typically the innermost enclosing
statement
However, I would rather follow Shepmaster's advice. Explicitly introducing a local variable helps the reader of the code concentrate on more important things, like ensuring that the unsafe code is correct (instead of having to figure out the exact semantics of temporary variables).
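A minimal sketch of that suggestion. The field inside Layout and RawLayout, the From impl, and the Rust-side foo are assumptions made only so the snippet compiles on its own; in the question, foo is an extern function.

```rust
// Stand-in types: the real Layout and RawLayout come from the asker's code.
#[derive(Clone, Copy)]
struct Layout(u8);

#[repr(C)]
struct RawLayout(u8);

impl From<Layout> for RawLayout {
    fn from(layout: Layout) -> RawLayout {
        RawLayout(layout.0)
    }
}

// Stand-in for the extern function: it only reads through the pointer.
unsafe fn foo(layout: *const RawLayout) -> u8 {
    // SAFETY: callers must pass a pointer to a live RawLayout.
    unsafe { (*layout).0 }
}

fn bar(layout: Layout) -> bool {
    // The named local makes the lifetime obvious: `raw` lives until the
    // end of `bar`, so the pointer is valid for the whole call to `foo`.
    let raw: RawLayout = layout.into();
    unsafe { foo(&raw as *const RawLayout) != 0 }
}
```

With the local in place, the reader no longer has to reason about when the temporary is dropped.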
How to check this
You can use the code below to check this behavior:
struct Layout;
struct RawLayout;
impl Into<RawLayout> for Layout {
fn into(self) -> RawLayout {
RawLayout
}
}
impl Drop for RawLayout {
fn drop(&mut self) {
println!("Dropping RawLayout");
}
}
unsafe fn foo(layout: *const RawLayout) -> u8 {
println!("foo called");
1
}
fn bar(layout: Layout) -> bool {
unsafe {
foo(&layout.into() as *const _) != 0
}
}
fn main() {
bar(Layout);
}
The output is:
foo called
Dropping RawLayout

Related

Is it undefined behavior to do runtime borrow management with the help of raw pointers in Rust?

As part of binding a C API to Rust, I have a mutable reference ph: &mut Ph, a struct struct EnsureValidContext<'a> { ph: &'a mut Ph }, and some methods:
impl Ph {
pub fn print(&mut self, s: &str) {
/*...*/
}
pub fn with_context<F, R>(&mut self, ctx: &Context, f: F) -> Result<R, InvalidContextError>
where
F: Fn(EnsureValidContext) -> R,
{
/*...*/
}
/* some others */
}
impl<'a> EnsureValidContext<'a> {
pub fn print(&mut self, s: &str) {
self.ph.print(s)
}
pub fn close(self) {}
/* some others */
}
I don't control these. I can only use these.
Now, the closure API is nice if you want the compiler to force you to think about performance and the tradeoffs between performance and the behaviour you want (context validation is expensive). However, let's say you don't care about that and just want it to work.
I was thinking of making a wrapper that handles it for you:
enum ValidPh<'a> {
Ph(&'a mut Ph),
Valid(*mut Ph, EnsureValidContext<'a>),
Poisoned,
}
impl<'a> ValidPh<'a> {
pub fn print(&mut self) {
/* whatever the case, just call .print() on the inner object */
}
pub fn set_context(&mut self, ctx: &Context) {
/*...*/
}
pub fn close(&mut self) {
/*...*/
}
/* some others */
}
This would work by, whenever necessary, checking if we're a Ph or a Valid, and if we're a Ph we'd upgrade to a Valid by going:
fn upgrade(&mut self) {
if let Ph(_) = self { // don't call mem::replace unless we need to
if let Ph(ph) = mem::replace(self, Poisoned) {
let ptr = ph as *mut _;
let evc = ph.with_context(ph.get_context(), |evc| evc);
*self = Valid(ptr, evc);
}
}
}
Downgrading is different for each method, as it has to call the target method, but here's an example close:
pub fn close(&mut self) {
if let Valid(_, _) = self {
/* ok */
} else {
self.upgrade()
}
if let Valid(ptr, evc) = mem::replace(self, Poisoned) {
evc.close(); // consume the evc, dropping the borrow.
// we can now use our original borrow, but since we don't have it anymore, bring it back using our trusty ptr
*self = unsafe { Ph(&mut *ptr) };
} else {
// this can only happen due to a bug in our code
unreachable!();
}
}
You get to use a ValidPh like:
/* given a &mut vph */
vph.print("hello world!");
if vph.set_context(ctx) {
vph.print("closing existing context");
vph.close();
}
vph.print("opening new context");
vph.open("context_name");
vph.print("printing in new context");
Without vph, you'd have to juggle &mut Ph and EnsureValidContext around on your own. While the Rust compiler makes this trivial (just follow the errors), you may want to let the library handle it automatically for you. Otherwise you might end up just calling the very expensive with_context for every operation, regardless of whether the operation can invalidate the context or not.
Note that this code is rough pseudocode. I haven't compiled or tested it yet.
One might argue I need an UnsafeCell or a RefCell or some other Cell. However, from reading this it appears UnsafeCell is only a lang item because of interior mutability — it's only necessary if you're mutating state through an &T, while in this case I have &mut T all the way.
However, my reading may be flawed. Does this code invoke UB?
(Full code of Ph and EnsureValidContext, including FFI bits, available here.)
Taking a step back, the guarantees upheld by Rust are:
&T is a reference to T which is potentially aliased,
&mut T is a reference to T which is guaranteed not to be aliased.
The crux of the question therefore is: what does "guaranteed not to be aliased" mean?
Let's consider a safe Rust sample:
struct Foo(u32);
impl Foo {
fn foo(&mut self) { self.bar(); }
fn bar(&mut self) { self.0 += 1; }
}
fn main() { Foo(0).foo(); }
If we take a peek at the stack when Foo::bar is being executed, we'll see at least two pointers to Foo: one in bar and one in foo, and there may be further copies on the stack or in other registers.
So, clearly, there are aliases in existence. How come? It's guaranteed NOT to be aliased!
Take a deep breath: how many of those aliases can you access at a time?
Only 1. The guarantee of no aliasing is not spatial but temporal.
I would think, therefore, that at any point in time, if a &mut T is accessible, then no other reference to this instance must be accessible.
Having a raw pointer (*mut T) is perfectly fine, since it requires unsafe to access. However, forming a second reference may or may not be safe, even without using it, so I would avoid it.
Rust's memory model is not rigorously defined yet, so it's hard to say for sure, but I believe it's not undefined behavior to:
carry a *mut Ph around while a &'a mut Ph is also reachable from another path, so long as you don't dereference the *mut Ph, even just for reading, and don't convert it to a &Ph or &mut Ph, because mutable references grant exclusive access to the pointee.
cast the *mut Ph back to a &'a mut Ph once the other &'a mut Ph falls out of scope.
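The two points above can be sketched as follows. Counter and Guard are hypothetical stand-ins for Ph and EnsureValidContext, not the real API. As a conservative choice, every reference in the sketch is derived from the raw pointer, and only one of them is in use at any given time.

```rust
// Hypothetical stand-in for Ph.
struct Counter {
    n: u32,
}

// Hypothetical stand-in for EnsureValidContext: it holds the exclusive borrow.
struct Guard<'a> {
    inner: &'a mut Counter,
}

impl<'a> Guard<'a> {
    fn bump(&mut self) {
        self.inner.n += 1;
    }

    // Consuming the guard ends its borrow.
    fn close(self) {}
}

fn bump_via_guard(counter: &mut Counter) -> u32 {
    // Keep a raw pointer around; merely holding it requires no unsafe.
    let ptr: *mut Counter = counter;

    // The guard's reference is created from the raw pointer.
    let mut guard = Guard { inner: unsafe { &mut *ptr } };
    guard.bump();
    guard.close();

    // The guard is gone, so no other reference made from `ptr` is live.
    // Only now is the raw pointer turned back into a reference.
    let again: &mut Counter = unsafe { &mut *ptr };
    again.n += 1;
    again.n
}
```

The raw pointer is never dereferenced while the guard is alive, which is the temporal discipline described above.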

How do I declare a static variable as a reference to a hard-coded memory address?

I am working on embedded Rust code for LPC82X series controllers from NXP - the exact toolchain does not matter for the question.
These controllers contain peripheral drivers in ROM. I want to use these drivers, which means I need to use unsafe Rust and FFI without linking actual code.
The ROM APIs expose function pointers packed into C structs at specific address locations. If somebody wants the details of this API, chapter 29 of the LPC82X manual describes the API in question.
My Rust playground dummy sketch looks like this; it would be hidden from application code by a yet-unwritten I2C abstraction library. It compiles.
#![feature(naked_functions)]
const I2C_ROM_API_ADDRESS: usize = 0x1fff_200c;
static mut ROM_I2C_API: Option<&RomI2cApi> = None;
#[repr(C)]
struct RomI2cApi {
// Dummy functions, real ones take arguments, and have different return
// value
// These won't be called directly, only through the struct's implemented methods
master_transmit_poll: extern "C" fn() -> bool,
master_receive_poll: extern "C" fn() -> bool,
}
impl RomI2cApi {
fn api_table() -> &'static RomI2cApi {
unsafe {
match ROM_I2C_API {
None => RomI2cApi::new(),
Some(table) => table,
}
}
}
unsafe fn new() -> &'static RomI2cApi {
ROM_I2C_API = Some(&*(I2C_ROM_API_ADDRESS as *const RomI2cApi));
ROM_I2C_API.unwrap()
}
#[inline]
fn master_transmit_poll(&self) -> bool {
(self.master_transmit_poll)()
}
#[inline]
fn master_receive_poll(&self) -> bool {
(self.master_receive_poll)()
}
}
impl From<usize> for &'static RomI2cApi {
fn from(address: usize) -> &'static RomI2cApi {
unsafe { &*(address as *const RomI2cApi) }
}
}
fn main() {
let rom_api = unsafe { RomI2cApi::api_table() };
println!("ROM I2C API address is: {:p}", rom_api);
// Should be commented out when trying this out!
rom_api.master_transmit_poll();
}
I cannot declare the function pointer structs as non-mutable static as statics have many restrictions, including not dereferencing pointers in the assignment. Is there a better workaround than Option? Using Option with the api_table function at least guarantees that initialization happens.
You can get around having a static at all:
const ROM_I2C_API: &RomI2cApi = &*(0x1fff_200c as *const RomI2cApi);
This does not work yet, but it is planned to work in the future. For now, use:
const ROM_I2C_API: *const RomI2cApi = 0x1fff_200c as *const RomI2cApi;
fn api_table() -> &'static RomI2cApi {
unsafe { &*(ROM_I2C_API) }
}
This creates a &'static RomI2cApi and allows you to access the functions everywhere directly by calling api_table().master_transmit_poll()
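The ROM address cannot be dereferenced on a development host, so here is a sketch of the same pattern that can be exercised anywhere. The api_table_at helper, FAKE_TABLE and the two always_* functions are hypothetical stand-ins; on the real target the address would be the constant from the question.

```rust
#[repr(C)]
struct RomI2cApi {
    master_transmit_poll: extern "C" fn() -> bool,
    master_receive_poll: extern "C" fn() -> bool,
}

impl RomI2cApi {
    fn master_transmit_poll(&self) -> bool {
        (self.master_transmit_poll)()
    }

    fn master_receive_poll(&self) -> bool {
        (self.master_receive_poll)()
    }
}

// Hypothetical helper: reinterpret an address as a table of function pointers.
// The caller must guarantee that a valid RomI2cApi lives at `address` forever.
unsafe fn api_table_at(address: usize) -> &'static RomI2cApi {
    unsafe { &*(address as *const RomI2cApi) }
}

// Host-side stand-ins for the ROM routines.
extern "C" fn always_true() -> bool {
    true
}

extern "C" fn always_false() -> bool {
    false
}

// A fake table in ordinary memory, playing the role of the ROM table.
static FAKE_TABLE: RomI2cApi = RomI2cApi {
    master_transmit_poll: always_true,
    master_receive_poll: always_false,
};
```

On the host, passing the address of FAKE_TABLE (cast to usize) to api_table_at yields a table whose methods can be called as in the question.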

Rc<Trait> to Option<T>?

I'm trying to implement a method that looks like:
fn concretify<T: Any>(rc: Rc<Any>) -> Option<T> {
Rc::try_unwrap(rc).ok().and_then(|trait_object| {
let b: Box<Any> = unimplemented!();
b.downcast().ok().map(|b| *b)
})
}
However, try_unwrap doesn't work on trait objects (which makes sense, as they're unsized). My next thought was to try to find some function that unwraps Rc<Any> into Box<Any> directly. The closest thing I could find would be
if Rc::strong_count(&rc) == 1 {
Some(unsafe {
Box::from_raw(Rc::into_raw(rc))
})
} else {
None
}
However, Rc::into_raw() appears to require that the type contained in the Rc be Sized, and ideally I'd rather not use unsafe blocks.
Is there any way to implement this?
Playground Link, I'm looking for an implementation of rc_to_box here.
Unfortunately, it appears that the API of Rc is lacking the necessary method to be able to get ownership of the wrapped type when it is !Sized.
The only method which may return the interior item of an Rc is Rc::try_unwrap. However, it returns Result<T, Rc<T>>, which requires that T be Sized.
In order to do what you wish, you would need to have a method with a signature: Rc<T> -> Result<Box<T>, Rc<T>>, which would allow T to be !Sized, and from there you could extract Box<Any> and perform the downcast call.
However, this method is impossible due to how Rc is implemented. Here is a stripped down version of Rc:
struct RcBox<T: ?Sized> {
strong: Cell<usize>,
weak: Cell<usize>,
value: T,
}
pub struct Rc<T: ?Sized> {
ptr: *mut RcBox<T>,
_marker: PhantomData<T>,
}
Therefore, the only Box you can get out of Rc<T> is Box<RcBox<T>>.
Note that the design is severely constrained here:
single-allocation mandates that all 3 elements be in a single struct
T: ?Sized mandates that T be the last field
so there is little room for improvement in general.
However, in your specific case, it is definitely possible to improve on the generic situation. It does, of course, require unsafe code. And while it works fairly well with Rc, implementing it with Arc would be complicated by the potential data-races.
Oh... and the code is provided as is, no warranty implied ;)
use std::any::Any;
use std::{cell, mem, ptr};
use std::rc::Rc;
struct RcBox<T: ?Sized> {
strong: cell::Cell<usize>,
_weak: cell::Cell<usize>,
value: T,
}
fn concretify<T: Any>(rc: Rc<Any>) -> Option<T> {
// Will be responsible for freeing the memory if there is no other weak
// pointer by the end of this function.
let _guard = Rc::downgrade(&rc);
unsafe {
let killer: &RcBox<Any> = {
let killer: *const RcBox<Any> = mem::transmute(rc);
&*killer
};
if killer.strong.get() != 1 { return None; }
// Do not forget to decrement the count if we do take ownership,
// as otherwise memory will not get released.
let result = killer.value.downcast_ref().map(|r| {
killer.strong.set(0);
ptr::read(r as *const T)
});
// Do not forget to destroy the content of the box if we did not
// take ownership
if result.is_none() {
let _: Rc<Any> = mem::transmute(killer as *const RcBox<Any>);
}
result
}
}
fn main() {
let x: Rc<Any> = Rc::new(1);
println!("{:?}", concretify::<i32>(x));
}
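As a safe alternative, newer versions of the standard library provide a downcast method on Rc<dyn Any> that turns it into an Rc<T>, after which Rc::try_unwrap applies because T is Sized. The sketch below assumes a toolchain recent enough to have that method and uses the dyn Any spelling instead of the bare Any used in the question.

```rust
use std::any::Any;
use std::rc::Rc;

fn concretify<T: Any>(rc: Rc<dyn Any>) -> Option<T> {
    // Rc<dyn Any> -> Rc<T>; this fails if the stored value is not a T.
    let typed: Rc<T> = rc.downcast::<T>().ok()?;
    // Rc<T> -> T; this fails if other strong references still exist.
    Rc::try_unwrap(typed).ok()
}
```

The function returns None both when the type does not match and when the Rc is still shared, in which case the value is simply dropped along with this handle.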
I don't think it's possible to implement your concretify function if you're expecting it to move the original value back out of the Rc; see this question for why.
If you're willing to return a clone, it's straightforward:
fn concretify<T: Any+Clone>(rc: Rc<Any>) -> Option<T> {
rc.downcast_ref().map(Clone::clone)
}
Here's a test:
#[derive(Debug,Clone)]
struct Foo(u32);
#[derive(Debug,Clone)]
struct Bar(i32);
fn main() {
let rc_foo: Rc<Any> = Rc::new(Foo(42));
let rc_bar: Rc<Any> = Rc::new(Bar(7));
let foo: Option<Foo> = concretify(rc_foo);
println!("Got back: {:?}", foo);
let bar: Option<Foo> = concretify(rc_bar);
println!("Got back: {:?}", bar);
}
This outputs:
Got back: Some(Foo(42))
Got back: None
Playground
If you want something more "movey", and creating your values is cheap, you could also make a dummy, use downcast_mut() instead of downcast_ref(), and then std::mem::swap with the dummy.
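A sketch of that swap-with-a-dummy idea, assuming the dummy can come from Default and that the Rc is not shared. The name take_out is hypothetical.

```rust
use std::any::Any;
use std::mem;
use std::rc::Rc;

fn take_out<T: Any + Default>(rc: &mut Rc<dyn Any>) -> Option<T> {
    // get_mut only succeeds when no other Rc or Weak points to the value.
    let any: &mut dyn Any = Rc::get_mut(rc)?;
    // downcast_mut fails if the stored value is not a T.
    let slot: &mut T = any.downcast_mut::<T>()?;
    // Move the value out, leaving T::default() behind as the dummy.
    Some(mem::replace(slot, T::default()))
}
```

After a successful call the Rc still holds a value of type T, but it is the default one.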

No method/field name found

I'm trying to apply some OOP but I'm facing a problem.
use std::io::Read;
struct Source {
look: char
}
impl Source {
fn new() {
Source {look: '\0'};
}
fn get_char(&mut self) {
self.look = 'a';
}
}
fn main() {
let src = Source::new();
src.get_char();
println!("{}", src.look);
}
Compiler reports these errors, for src.get_char();:
error: no method named get_char found for type () in the current
scope
and for println!("{}", src.look);:
attempted access of field look on type (), but no field with that
name was found
I can't find out what I've missed.
Source::new has no return type specified, and thus returns () (the empty tuple, also called unit).
As a result, src has type (), which does not have a get_char method, which is what the error message is telling you.
So, first things first, let's set a proper signature for new: fn new() -> Source. Now we get:
error: not all control paths return a value [E0269]
fn new() -> Source {
Source {look: '\0'};
}
This happens because Rust is an expression language: nearly everything is an expression, unless a semicolon is used to transform the expression into a statement. You can write new either as:
fn new() -> Source {
return Source { look: '\0' };
}
Or:
fn new() -> Source {
Source { look: '\0' } // look Ma, no semi-colon!
}
The latter being the more idiomatic in Rust.
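To see the difference the semicolon makes, here is a small sketch with two free functions. The names make_source and make_nothing are made up for illustration.

```rust
struct Source {
    look: char,
}

// Tail expression without a semicolon: its value is the return value.
fn make_source() -> Source {
    Source { look: '\0' }
}

// The same expression turned into a statement by the semicolon: the value
// is discarded and the function returns the unit value `()`.
fn make_nothing() {
    Source { look: '\0' };
}
```

make_nothing compiles only because its declared return type is (), which is exactly what the original new was doing by accident.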
So, let's do that, now we get:
error: cannot borrow immutable local variable `src` as mutable
src.get_char();
^~~
This is because src is declared immutable (the default); for it to be mutable you need to use let mut src.
And now it all works!
Final code:
use std::io::Read;
struct Source {
look: char
}
impl Source {
fn new() -> Source {
Source {look: '\0'}
}
fn get_char(&mut self) {
self.look = 'a';
}
}
fn main() {
let mut src = Source::new();
src.get_char();
println!("{}", src.look);
}
Note: there is a warning because std::io::Read is unused, but I assume you plan to use it.

How do I obtain the address of a function?

How do I obtain a function address in Rust? What does '&somefunction' exactly mean?
What addresses do I get doing
std::mem::transmute::<_, u32>(function)
or
std::mem::transmute::<_, u32>(&function)
(on 32-bit system, of course)?
What does
&function as *const _ as *const c_void
give?
If I just wanted to know the address of a function, I'd probably just print it out:
fn moo() {}
fn main() {
println!("{:p}", moo as *const ());
}
However, I can't think of a useful reason to want to do this. Usually, there's something you want to do with the function. In those cases, you might as well just pass the function directly, no need to deal with the address:
fn moo() {}
fn do_moo(f: fn()) {
f()
}
fn main() {
do_moo(moo);
}
I'm less sure about this, but I think that std::mem::transmute::<_, u32>(&function) would just create a local variable that points to function and then gets the reference to that variable. This would match how this code works:
fn main() {
let a = &42;
}
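One way to probe this is to compare sizes. A function item such as moo has its own zero-sized type, while a fn() pointer is pointer-sized. That suggests &moo refers to a zero-sized value rather than to the code address. The helper names below are made up for illustration.

```rust
use std::mem;

fn moo() {}

// Coercing the function item to a `fn()` pointer and casting it to an
// integer gives the address of the function's code.
fn address_of(f: fn()) -> usize {
    f as usize
}

// Size of the function item itself (the value that `&moo` points to).
fn item_size() -> usize {
    mem::size_of_val(&moo)
}

// Size of a function pointer.
fn pointer_size() -> usize {
    mem::size_of::<fn()>()
}
```

Calling address_of(moo) therefore gives the function's address, whereas transmuting &moo would only give the address of some zero-sized value.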
I don't need to work with them in Rust; I need an address because I have some FFI that takes the address of a symbol in the current process
You can still just pass the function as-is to the extern functions that will use the callback:
extern {
fn a_thing_that_does_a_callback(callback: extern fn(u8) -> bool);
}
extern fn zero(a: u8) -> bool { a == 0 }
fn main() {
unsafe { a_thing_that_does_a_callback(zero); }
}
The answer by @Shepmaster answers this question (along with some other information that is not directly relevant but may be useful to somebody). I summarize it here.
Obtaining the address is easy, just:
funct as *const ()
A reference to a function seems to create a local variable, just like with
let a = &42;
