Change mutability of function argument depending on const generic - rust

I am developing a CPU emulator in Rust. During normal execution, reading certain memory addresses has side-effects, therefore a &mut should be passed to the read function to allow for mutation of state.
However, for debugging purposes, I also have a trace unit which is given an immutable reference to the CPU and memory (reading should have no side-effects). At the moment, the read function looks something like this:
pub fn read<const SIDE_EFFECTS: bool>(memory: &mut CMem, addr: u16) -> u8 {
match addr {
0x0000..=0x1FFF => {
if SIDE_EFFECTS {
/* do something to memory with side-effects */
} else {
/* only "observe" something in memory */
}
}
}
}
This function is OK, but actually, the case when SIDE_EFFECTS is false is somewhat redundant as the function signature itself requires a &mut irrespective of the value of SIDE_EFFECTS. What I really want is to have a parameter list conditional on the const generic. Something like (not real Rust code):
pub fn read<const SIDE_EFFECTS: bool>(memory: T, addr: u16) -> u8
where T: if SIDE_EFFECTS { DerefMut<Target = CMem> } else { Deref<Target = CMem> } {
match addr {
0x0000..=0x1FFF => {
if SIDE_EFFECTS {
/* do something to memory with side-effects */
} else {
/* only "observe" something in memory */
}
}
}
}
Is something like this possible, or will I need to redesign?
Thanks

Related

Rust type that requires manual drop

Is there a way to have the Rust compiler error if a struct is to be automatically dropped?
Example: I'm implementing a memory pool and want the allocations to be manually returned to the memory pool to prevent leakage. Is there something like RequireManualDrop from the below example?
impl MemoryPool {
pub fn allocate(&mut self) -> Option<Allocation> { /* ... */ }
pub fn free(&mut self, alloc: Allocation) { let (ptr, size) = alloc.inner.into_inner(); /* ... */ }
}
pub struct Allocation {
inner: RequireManualDrop<(*mut u8, usize)>,
}
fn valid_usage(mem: &mut MemoryPool) {
let chunk = mem.allocate();
/* ... */
mem.free(chunk);
}
/* compile error: Allocation.inner needs to be manully dropped */
fn will_have_compile_error(mem: &mut MemoryPool) {
let chunk = mem.allocate();
/* ... */
}
Rust doesn't have a direct solution for this. However, you can use a nit trick for that (this trick is taken from https://github.com/Kixunil/dont_panic).
If you call a declared but not defined function, the linker will report an error. We can use that to error on drop. By calling an undefined function in the destructor, the code will not compile unless the destructor is not called.
#[derive(Debug)]
pub struct Undroppable(pub i32);
impl Drop for Undroppable {
fn drop(&mut self) {
extern "C" {
// This will show (somewhat) useful error message instead of complete gibberish
#[link_name = "\n\nERROR: `Undroppable` implicitly dropped\n\n"]
fn trigger() -> !;
}
unsafe { trigger() }
}
}
pub fn manually_drop(v: Undroppable) {
let v = std::mem::ManuallyDrop::new(v);
println!("Manually dropping {v:?}...");
}
Playground.
However, beware that while in this case it worked even in debug builds, it may require optimizations in other cases to eliminate the call. And unwinding may complicate the process even more, because it can lead to unexpected implicit drops. For example, if I change manually_drop() to the following seemingly identical version:
pub fn manually_drop(v: Undroppable) {
println!("Manually dropping {v:?}...");
std::mem::forget(v);
}
It doesn't work, because println!() may unwind, and then std::mem::forget(v) won't be reached. If a function that takes Undroppable by value must unwind, you probably have no options but to wrap it with ManuallyDrop and manually ensure it is correctly dropped (though you can just drop it at the end of the function, and let the unwind case leak it, as leaking in panicking is mostly fine).

Is there a Rust equivalent to C++'s operator bool() to convert a struct into a boolean?

In C++, we can overload operator bool() to convert a struct to bool:
struct Example {
explicit operator bool() const {
return false;
}
};
int main() {
Example a;
if (a) { /* some work */ }
}
Can we do something simple (and elegant) in Rust so to:
pub struct Example {}
fn main() {
let k = Example {};
if k {
// some work
}
}
There's no direct equivalent of operator bool(). A close alternative would be to implement From (which will also implement Into) and call the conversion explicitly:
pub struct Example;
impl From<Example> for bool {
fn from(_other: Example) -> bool {
false
}
}
fn main() {
let k = Example;
if k.into() {
// some work
}
}
This will take ownership of Example, meaning you can't use k after it's been converted. You could implement it for a reference (impl From<&Example> for bool) but then the call site becomes uglier ((&k).into()).
I'd probably avoid using From / Into for this case. Instead, I'd create a predicate method on the type. This will be more readable and can take &self, allowing you to continue using the value.
See also:
When should I implement std::convert::From vs std::convert::Into?
Rust does not have C++'s implicit type conversion via operator overloading. The closest means of implicit conversion is through a Deref impl, which provides a reference of a different type.
What is possible, albeit not necessarily idiomatic, is to implement the not operator ! so that it returns a boolean value, and perform the not operation twice when needed.
use std::ops::Not;
pub struct Example;
impl Not for Example {
type Output = bool;
fn not(self) -> bool { false }
}
fn main() {
let k = Example;
if !!k {
println!("OK!");
} else {
println!("WAT");
}
}
Playground
You have a few options, but I'd go for one of these:
Into<bool> (From<Example>)
If your trait conceptually represents a bool, but maybe with some extra metadata, you can implement From<Example> for bool:
impl From<Example> for bool {
fn from(e: Example) {
// perform the conversion
}
}
Then you can:
fn main() {
let x = Example { /* ... */ };
if x.into() {
// ...
}
}
Custom method
If your type doesn't really represent a boolean value, I'd usually go for an explicit method:
impl Example {
fn has_property() -> bool { /* ... */ }
}
This makes it more obvious what the intent is, for example, if you implemented From<User> for bool:
fn main() {
let user = User { /* ... */ };
if user.into() {
// when does this code get run??
}
// compared to
if user.logged_in() {
// much clearer
}
}
You can implement std::ops::Deref with the bool type. If you do that, you have to call *k to get the boolean.
This is not recommended though, according to the Rust documentation:
On the other hand, the rules regarding Deref and DerefMut were designed specifically to accommodate smart pointers. Because of this, Deref should only be implemented for smart pointers to avoid confusion.
struct Example {}
impl std::ops::Deref for Example {
type Target = bool;
fn deref(&self) -> &Self::Target {
&true
}
}
fn main() {
let k = Example {};
if *k {
// some work
}
}
Playground

Convenient 'Option<Box<Any>>' access when success is assured?

When writing callbacks for generic interfaces, it can be useful for them to define their own local data which they are responsible for creating and accessing.
In C I would just use a void pointer, C-like example:
struct SomeTool {
int type;
void *custom_data;
};
void invoke(SomeTool *tool) {
StructOnlyForThisTool *data = malloc(sizeof(*data));
/* ... fill in the data ... */
tool.custom_data = custom_data;
}
void execute(SomeTool *tool) {
StructOnlyForThisTool *data = tool.custom_data;
if (data.foo_bar) { /* do something */ }
}
When writing something similar in Rust, replacing void * with Option<Box<Any>>, however I'm finding that accessing the data is unreasonably verbose, eg:
struct SomeTool {
type: i32,
custom_data: Option<Box<Any>>,
};
fn invoke(tool: &mut SomeTool) {
let data = StructOnlyForThisTool { /* my custom data */ }
/* ... fill in the data ... */
tool.custom_data = Some(Box::new(custom_data));
}
fn execute(tool: &mut SomeTool) {
let data = tool.custom_data.as_ref().unwrap().downcast_ref::<StructOnlyForThisTool>().unwrap();
if data.foo_bar { /* do something */ }
}
There is one line here which I'd like to be able to write in a more compact way:
tool.custom_data.as_ref().unwrap().downcast_ref::<StructOnlyForThisTool>().unwrap()
tool.custom_data.as_ref().unwrap().downcast_mut::<StructOnlyForThisTool>().unwrap()
While each method makes sense on its own, in practice it's not something I'd want to write throughout a code-base, and not something I'm going to want to type out often or remember easily.
By convention, the uses of unwrap here aren't dangerous because:
While only some tools define custom data, the ones that do always define it.
When the data is set, by convention the tool only ever sets its own data. So there is no chance of having the wrong data.
Any time these conventions aren't followed, its a bug and should panic.
Given these conventions, and assuming accessing custom-data from a tool is something that's done often - what would be a good way to simplify this expression?
Some possible options:
Remove the Option, just use Box<Any> with Box::new(()) representing None so access can be simplified a little.
Use a macro or function to hide verbosity - passing in the Option<Box<Any>>: will work of course, but prefer not - would use as a last resort.
Add a trait to Option<Box<Any>> which exposes a method such as tool.custom_data.unwrap_box::<StructOnlyForThisTool>() with matching unwrap_box_mut.
Update 1): since asking this question a point I didn't include seems relevant.
There may be multiple callback functions like execute which must all be able to access the custom_data. At the time I didn't think this was important to point out.
Update 2): Wrapping this in a function which takes tool isn't practical, since the borrow checker then prevents further access to members of tool until the cast variable goes out of scope, I found the only reliable way to do this was to write a macro.
If the implementation really only has a single method with a name like execute, that is a strong indication to consider using a closure to capture the implementation data. SomeTool can incorporate an arbitrary callable in a type-erased manner using a boxed FnMut, as shown in this answer. execute() then boils down to invoking the closure stored in the struct field implementation closure using (self.impl_)(). For a more general approach, that will also work when you have more methods on the implementation, read on.
An idiomatic and type-safe equivalent of the type+dataptr C pattern is to store the implementation type and pointer to data together as a trait object. The SomeTool struct can contain a single field, a boxed SomeToolImpl trait object, where the trait specifies tool-specific methods such as execute. This has the following characteristics:
You no longer need an explicit type field because the run-time type information is incorporated in the trait object.
Each tool's implementation of the trait methods can access its own data in a type-safe manner without casts or unwraps. This is because the trait object's vtable automatically invokes the correct function for the correct trait implementation, and it is a compile-time error to try to invoke a different one.
The "fat pointer" representation of the trait object has the same performance characteristics as the type+dataptr pair - for example, the size of SomeTool will be two pointers, and accessing the implementation data will still involve a single pointer dereference.
Here is an example implementation:
struct SomeTool {
impl_: Box<SomeToolImpl>,
}
impl SomeTool {
fn execute(&mut self) {
self.impl_.execute();
}
}
trait SomeToolImpl {
fn execute(&mut self);
}
struct SpecificTool1 {
foo_bar: bool
}
impl SpecificTool1 {
pub fn new(foo_bar: bool) -> SomeTool {
let my_data = SpecificTool1 { foo_bar: foo_bar };
SomeTool { impl_: Box::new(my_data) }
}
}
impl SomeToolImpl for SpecificTool1 {
fn execute(&mut self) {
println!("I am {}", self.foo_bar);
}
}
struct SpecificTool2 {
num: u64
}
impl SpecificTool2 {
pub fn new(num: u64) -> SomeTool {
let my_data = SpecificTool2 { num: num };
SomeTool { impl_: Box::new(my_data) }
}
}
impl SomeToolImpl for SpecificTool2 {
fn execute(&mut self) {
println!("I am {}", self.num);
}
}
pub fn main() {
let mut tool1: SomeTool = SpecificTool1::new(true);
let mut tool2: SomeTool = SpecificTool2::new(42);
tool1.execute();
tool2.execute();
}
Note that, in this design, it doesn't make sense to make implementation an Option because we always associate the tool type with the implementation. While it is perfectly valid to have an implementation without data, it must always have a type associated with it.

Drop a Rust void pointer stored in an FFI

I'm wrapping a C API which allows the caller to set/get an arbitrary pointer via function calls. In this way, the C API allows a caller to associate arbitrary data with one of the C API objects. This data is not used in any callbacks, it's just a pointer that a user can stash away and get at later.
My wrapper struct implements the Drop trait for the C object that contains this pointer. What I'd like to be able to do, but am not sure it's possible, is have the data dropped correctly if the pointer is not null when the wrapper struct drops. I'm not sure how I would recover the correct type though from a raw c_void pointer.
Two alternatives I'm thinking of are
Implement the behavior of these two calls in the wrapper. Don't make any calls to the C API.
Don't attempt to offer any kind of safer interface to these functions. Document that the pointer must be managed by the caller of the wrapper.
Is what I want to do possible? If not, is there a generally accepted practice for these kinds of situations?
A naive + fully automatic approach is NOT possible for the following reasons:
freeing memory does not call drop/deconstructors/...: the C API can be used from languages which can have objects which should be deconstructed properly, e.g. C++ or Rust itself. So when you only store a memory pointer you do not know you to call the proper function (you neither know which function not how the calling conventions look like).
which memory allocator?: memory allocation and deallocation isn't a trivial thing. your program needs to request memory from the OS and then manage this resources in an intelligent way to be efficient and correct. This is usually done by a library. In case of Rust, jemalloc is used (but can be changed). So even when you ask the API caller to only pass Plain Old Data (which should be easier to destruct) you still don't know which library function to call to deallocate memory. Just using libc::free won't work (it can but it could horrible fail).
Solutions:
dealloc callback: you can ask the API user to set an additional pointer to, let's say a void destruct(void* ptr) function. If this one is not NULL, you call that function during your drop. You could also use int as an return type to signal when the destruction went wrong. In that case you could for example panic!.
global callback: let's assume you requested your user to only pass POD (plain old data). To know which free function of the memory allocator to call, you could request the user to register a global void (*free)(void* ptr) pointer which is called during drop. You could also make that one optional.
Although I was able to follow the advice in this thread, I wasn't entirely satisfied with my results, so I asked the question on the Rust forums and found the answer I was really looking for. (play)
use std::any::Any;
static mut foreign_ptr: *mut () = 0 as *mut ();
unsafe fn api_set_fp(ptr: *mut ()) {
foreign_ptr = ptr;
}
unsafe fn api_get_fp() -> *mut() {
foreign_ptr
}
struct ApiWrapper {}
impl ApiWrapper {
fn set_foreign<T: Any>(&mut self, value: Box<T>) {
self.free_foreign();
unsafe {
let raw = Box::into_raw(Box::new(value as Box<Any>));
api_set_fp(raw as *mut ());
}
}
fn get_foreign_ref<T: Any>(&self) -> Option<&T> {
unsafe {
let raw = api_get_fp() as *const Box<Any>;
if !raw.is_null() {
let b: &Box<Any> = &*raw;
b.downcast_ref()
} else {
None
}
}
}
fn get_foreign_mut<T: Any>(&mut self) -> Option<&mut T> {
unsafe {
let raw = api_get_fp() as *mut Box<Any>;
if !raw.is_null() {
let b: &mut Box<Any> = &mut *raw;
b.downcast_mut()
} else {
None
}
}
}
fn free_foreign(&mut self) {
unsafe {
let raw = api_get_fp() as *mut Box<Any>;
if !raw.is_null() {
Box::from_raw(raw);
}
}
}
}
impl Drop for ApiWrapper {
fn drop(&mut self) {
self.free_foreign();
}
}
struct MyData {
i: i32,
}
impl Drop for MyData {
fn drop(&mut self) {
println!("Dropping MyData with value {}", self.i);
}
}
fn main() {
let p1 = Box::new(MyData {i: 1});
let mut api = ApiWrapper{};
api.set_foreign(p1);
{
let p2 = api.get_foreign_ref::<MyData>().unwrap();
println!("i is {}", p2.i);
}
api.set_foreign(Box::new("Hello!"));
{
let p3 = api.get_foreign_ref::<&'static str>().unwrap();
println!("payload is {}", p3);
}
}

Detecting new struct initialization

I'm coming from mostly OOP languages, so getting this concept to work in Rust kinda seems hard. I want to implement a basic counter that keeps count of how many "instances" I've made of that type, and keep them in a vector for later use.
I've tried many different things, first was making a static vector variable, but that cant be done due to it not allowing static stuff that have destructors.
This was my first try:
struct Entity {
name: String,
}
struct EntityCounter {
count: i64,
}
impl Entity {
pub fn init() {
let counter = EntityCounter { count: 0 };
}
pub fn new(name: String) {
println!("Entity named {} was made.", name);
counter += 1; // counter variable unaccessable (is there a way to make it global to the struct (?..idek))
}
}
fn main() {
Entity::init();
Entity::new("Hello".to_string());
}
Second:
struct Entity {
name: String,
counter: i32,
}
impl Entity {
pub fn new(self) {
println!("Entity named {} was made.", self.name);
self.counter = self.counter + 1;
}
}
fn main() {
Entity::new(Entity { name: "Test".to_string() });
}
None of those work, I was just trying out some concepts on how I could be able to implement such a feature.
Your problems appear to be somewhat more fundamental than what you describe. You're kind of throwing code at the wall to see what sticks, and that's simply not going to get you anywhere. I'd recommend reading the Rust Book completely before continuing. If you don't understand something in it, ask about it. As it stands, you're demonstrating you don't understand variable scoping, return types, how instance construction works, how statics work, and how parameters are passed. That's a really shaky base to try and build any understanding on.
In this particular case, you're asking for something that's deliberately not straightforward. You say you want a counter and a vector of instances. The counter is simple enough, but a vector of instances? Rust doesn't allow easy sharing like other languages, so how you go about doing that depends heavily on what it is you're actually intending to use this for.
What follows is a very rough guess at something that's maybe vaguely similar to what you want.
/*!
Because we need the `lazy_static` crate, you need to add the following to your
`Cargo.toml` file:
```cargo
[dependencies]
lazy_static = "0.2.1"
```
*/
#[macro_use] extern crate lazy_static;
mod entity {
use std::sync::{Arc, Weak, Mutex};
use std::sync::atomic;
pub struct Entity {
pub name: String,
}
impl Entity {
pub fn new(name: String) -> Arc<Self> {
println!("Entity named {} was made.", name);
let ent = Arc::new(Entity {
name: name,
});
bump_counter();
remember_instance(ent.clone());
ent
}
}
/*
The counter is simple enough, though I'm not clear on *why* you even want
it in the first place. You don't appear to be using it for anything...
*/
static COUNTER: atomic::AtomicUsize = atomic::ATOMIC_USIZE_INIT;
fn bump_counter() {
// Add one using the most conservative ordering.
COUNTER.fetch_add(1, atomic::Ordering::SeqCst);
}
pub fn get_counter() -> usize {
COUNTER.load(atomic::Ordering::SeqCst)
}
/*
There are *multiple* ways of doing this part, and you simply haven't given
enough information on what it is you're trying to do. This is, at best,
a *very* rough guess.
`Mutex` lets us safely mutate the vector from any thread, and `Weak`
prevents `INSTANCES` from keeping every instance alive *forever*. I mean,
maybe you *want* that, but you didn't specify.
Note that I haven't written a "cleanup" function here to remove dead weak
references.
*/
lazy_static! {
static ref INSTANCES: Mutex<Vec<Weak<Entity>>> = Mutex::new(vec![]);
}
fn remember_instance(entity: Arc<Entity>) {
// Downgrade to a weak reference. Type constraint is just for clarity.
let entity: Weak<Entity> = Arc::downgrade(&entity);
INSTANCES
// Lock mutex
.lock().expect("INSTANCES mutex was poisoned")
// Push entity
.push(entity);
}
pub fn get_instances() -> Vec<Arc<Entity>> {
/*
This is about as inefficient as I could write this, but again, without
knowing your access patterns, I can't really do any better.
*/
INSTANCES
// Lock mutex
.lock().expect("INSTANCES mutex was poisoned")
// Get a borrowing iterator from the Vec.
.iter()
/*
Convert each `&Weak<Entity>` into a fresh `Arc<Entity>`. If we
couldn't (because the weak ref is dead), just drop that element.
*/
.filter_map(|weak_entity| weak_entity.upgrade())
// Collect into a new `Vec`.
.collect()
}
}
fn main() {
use entity::Entity;
let e0 = Entity::new("Entity 0".to_string());
println!("e0: {}", e0.name);
{
let e1 = Entity::new("Entity 1".to_string());
println!("e1: {}", e1.name);
/*
`e1` is dropped here, which should cause the underlying `Entity` to
stop existing, since there are no other "strong" references to it.
*/
}
let e2 = Entity::new("Entity 2".to_string());
println!("e2: {}", e2.name);
println!("Counter: {}", entity::get_counter());
println!("Instances:");
for ent in entity::get_instances() {
println!("- {}", ent.name);
}
}

Resources