A couple of months ago I started writing an X11 window manager in C with Xlib, and now I'm trying to rewrite it in Rust (after all, Rust is the future, right?).
This is the code from main.rs that segfaults:
// event loop
loop {
let mut event : *mut XEvent = 0 as *mut XEvent;
unsafe {
XNextEvent(DISPLAY, event);
/*match (*event).type_ {
CreateNotify => {
println!("sdfg");
}
_ => {
println!("sdfgsdfg");
}
}*/
}
println!("dsfgsdfg");
}
Using GDB, I was able to narrow it down to XNextEvent inside the unsafe block. I'm not really sure where to go from there, since it's calling a function in /usr/lib/libX11.so.6 and I can't list that function to see where it quit.
Before you ask about the unsafe block, I can't put Xlib functions outside of unsafe blocks because they are unsafe functions and the compiler will complain.
The project uses the x11 crate on crates.io.
You are passing a null pointer instead of a pointer to a local variable to be filled in by XNextEvent, just as if your C code did:
XEvent *event = NULL;
XNextEvent(DISPLAY, event); //crash!
In C, you are probably doing:
XEvent event;
XNextEvent(DISPLAY, &event);
But unfortunately you cannot do that directly in Rust, because you cannot pass a reference to an uninitialized variable:
let mut event: XEvent;
XNextEvent(DISPLAY, &mut event); //error: event is not initialized
If XEvent implemented Default you could just default-initialize it, but I don't think it does.
For these cases you can use std::mem::MaybeUninit:
let event = unsafe {
    let mut event = MaybeUninit::uninit();
    XNextEvent(DISPLAY, event.as_mut_ptr());
    event.assume_init()
};
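For context, here is a sketch of how the whole event loop might look with this fix. It assumes the xlib feature of the x11 crate; XOpenDisplay stands in for however the question initialises its global DISPLAY, and the match mirrors the commented-out one from the question:

use std::mem::MaybeUninit;
use std::ptr;
use x11::xlib::{CreateNotify, XEvent, XNextEvent, XOpenDisplay};

fn main() {
    // Sketch only: open the default display instead of using the question's global.
    let display = unsafe { XOpenDisplay(ptr::null()) };
    assert!(!display.is_null(), "could not open X display");

    loop {
        let event = unsafe {
            let mut event = MaybeUninit::<XEvent>::uninit();
            XNextEvent(display, event.as_mut_ptr());
            event.assume_init()
        };
        // XEvent is a union in the x11 crate, so reading the discriminant is unsafe.
        match unsafe { event.type_ } {
            CreateNotify => println!("CreateNotify"),
            _ => println!("other event"),
        }
    }
}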
Related
I would like to call the C function libusb1_sys::libusb_set_pollfd_notifiers() from Rust, providing callbacks and context. The C function is:
void libusb_set_pollfd_notifiers(
libusb_context * ctx,
libusb_pollfd_added_cb added_cb,
libusb_pollfd_removed_cb removed_cb,
void * user_data
);
This follows the traditional method for "closures" in C where user_data provides context for the callback.
The following code compiles and runs, but as far as I can tell is wrong:
use libusb1_sys::libusb_pollfd;
use rusb::UsbContext;
extern "system" fn usb_source_added_cb(
_fd: libc::c_int,
_events: libc::c_short,
_user_data: *mut libc::c_void,
) {
println!("USB source added");
}
extern "system" fn usb_source_removed_cb(
_fd: libc::c_int,
_user_data: *mut libc::c_void,
) {
println!("USB source removed");
}
struct UsbFdChannel {
sender: std::sync::mpsc::Sender<libc::c_int>,
context: rusb::Context,
}
impl UsbFdChannel {
fn new() -> (Self, std::sync::mpsc::Receiver<libc::c_int>) {
let context = rusb::Context::new().unwrap();
// This gives me a `*mut libusb_context`
let raw_context = context.as_raw();
let (mut sender, receiver) = std::sync::mpsc::channel();
unsafe {
libusb1_sys::libusb_set_pollfd_notifiers(
raw_context,
usb_source_added_cb,
usb_source_removed_cb,
&mut sender as *mut _ as *mut libc::c_void,
);
}
(Self { sender, context }, receiver)
}
}
fn main() {
let _usb_channel = UsbFdChannel::new();
}
I think it's wrong because once UsbFdChannel::new() returns, the *mut to sender is no longer valid, but libusb's context is still able to use it. Am I correct?
What I would like is for this incorrect code to not compile. If possible, I'd like this entire class of incorrect code to not compile. I think what I want is to associate a lifetime with sender, or possibly with &mut sender, that the compiler will then enforce. Or maybe I want to "trick" Rust into thinking that the lifetime of *mut sender is bounded by that of &mut sender...?
If I achieve that, then I can resolve the incorrectness however I can - by having UsbFdChannel take a reference, or create a reference counted/boxed/both thing, or whatever. The important thing to me (for both solving this particular problem and for learning how to write better Rust) is that I can identify and declare the ownership problem first, and then solve that problem second.
I considered using std::marker::PhantomData, but since UsbFdChannel is not parameterised by a type, I don't see how to tell it what member the PhantomData's lifetime would correspond to. I also considered using some kind of non-stack allocation and reference counting, but I still want to know (inasmuch as Rust can tell me) that I've solved the problem. How do I do that?
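For what it's worth, PhantomData does not require the struct to be parameterised by a type; a lifetime parameter alone is enough to tie the struct to a borrow it does not actually store. A minimal sketch of that idea (Registration and its methods are hypothetical names, not part of libusb1_sys or rusb):

use std::marker::PhantomData;
use std::sync::mpsc::Sender;

// While a Registration<'a> is alive, the compiler treats the Sender as mutably
// borrowed for 'a, so it cannot be moved or dropped out from under the raw pointer.
struct Registration<'a> {
    user_data: *mut libc::c_void,
    _borrow: PhantomData<&'a mut Sender<libc::c_int>>,
}

impl<'a> Registration<'a> {
    fn new(sender: &'a mut Sender<libc::c_int>) -> Registration<'a> {
        Registration {
            user_data: sender as *mut Sender<libc::c_int> as *mut libc::c_void,
            _borrow: PhantomData,
        }
    }

    fn user_data(&self) -> *mut libc::c_void {
        self.user_data
    }
}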
This question was closed as a duplicate of "How can I call a raw address from Rust?".
Hello people of the internet,
I'm struggling to invoke a function whose address is stored in a libc::c_void pointer. I can't tell Rust that the pointer is callable, and I can't figure out how to do it.
I want to translate this C++ code
void * malloc(size_t size) {
static void *(*real_malloc)(size_t) = nullptr;
if (real_malloc == nullptr) {
real_malloc = reinterpret_cast<void *(*)(size_t)> (dlsym(RTLD_NEXT, "malloc"));
}
// do some logging stuff
void * ptr = real_malloc(size);
return ptr;
}
to Rust.
#[no_mangle]
pub extern fn malloc(bytes: usize) {
let c_string = "malloc\0".as_mut_ptr() as *mut i8; // char array for libc
let real_malloc: *mut libc::c_void = libc::dlsym(libc::RTLD_NEXT, c_string);
return real_malloc(bytes);
}
That's my progress so far after an hour of searching the internet and experimenting. I'm new to Rust and not yet familiar with Rust FFI or using libc from Rust. I tried a lot with unsafe {} and casts with as, but I always get stuck at the following problem:
return real_malloc(bytes);
^^^^^^^^^^^^^^^^^^ expected (), found *-ptr
Q1: How can I call the function behind the void-Pointer stored in real_malloc?
Q2: Is my Rust string to C string conversion feasible this way?
I figured it out! Perhaps there is a better way, but this works.
The trick is to "cast" the void pointer to a C function type with std::mem::transmute, since it won't work with as:
type LibCMallocT = fn(usize) -> *mut libc::c_void;

// C-style string for the symbol name
let c_string = "malloc\0".as_ptr() as *mut i8;
// Void pointer to the address of the symbol
let real_malloc_addr: *mut libc::c_void = unsafe { libc::dlsym(libc::RTLD_NEXT, c_string) };
// transmute: "Reinterprets the bits of a value of one type as another type"
// Turn the void pointer into a callable C function
let real_malloc: LibCMallocT = unsafe { std::mem::transmute(real_malloc_addr) };
When the shared object is built, one can verify that it works like this:
LD_PRELOAD=./target/debug/libmalloc_log_lib.so some-binary
My full code:
extern crate libc;

use std::io::Write;

const MSG: &str = "HELLO WORLD\n";

type LibCMallocT = fn(usize) -> *mut libc::c_void;

#[no_mangle] // "malloc" becomes the symbol name, so ELF files can find it (if this lib is preloaded)
pub extern fn malloc(bytes: usize) -> *mut libc::c_void {
    /// Disable logging, i.e. immediately return the pointer from the real (libc) malloc
    static mut RETURN_IMMEDIATELY: bool = false;

    // C-style string for the symbol name
    let c_string = "malloc\0".as_ptr() as *mut i8;
    // Void pointer to the address of the symbol
    let real_malloc_addr: *mut libc::c_void = unsafe { libc::dlsym(libc::RTLD_NEXT, c_string) };
    // transmute: "Reinterprets the bits of a value of one type as another type"
    // Turn the void pointer into a callable C function
    let real_malloc: LibCMallocT = unsafe { std::mem::transmute(real_malloc_addr) };

    unsafe {
        if !RETURN_IMMEDIATELY {
            // Let's do logging and other stuff that potentially
            // needs malloc() itself.
            // This variable prevents infinite loops, because
            // 'std::io::stdout().write_all' also uses malloc itself.
            // TODO: do proper synchronisation
            // (lock the whole method? thread_local variable?)
            RETURN_IMMEDIATELY = true;
            match std::io::stdout().write_all(MSG.as_bytes()) {
                _ => ()
            };
            RETURN_IMMEDIATELY = false
        }
    }

    (real_malloc)(bytes)
}
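Regarding Q2: the manual "malloc\0" string plus as_ptr() does work, but std::ffi::CStr makes the NUL termination explicit and checked. A sketch of the same lookup with CStr (the unsafe extern "C" fn alias and the helper name are my own, not taken from the answer above):

use std::ffi::CStr;

type LibCMallocFn = unsafe extern "C" fn(usize) -> *mut libc::c_void;

fn lookup_real_malloc() -> LibCMallocFn {
    // from_bytes_with_nul verifies the trailing NUL byte at runtime.
    let name = CStr::from_bytes_with_nul(b"malloc\0").unwrap();
    let addr = unsafe { libc::dlsym(libc::RTLD_NEXT, name.as_ptr()) };
    assert!(!addr.is_null(), "dlsym could not find malloc");
    unsafe { std::mem::transmute::<*mut libc::c_void, LibCMallocFn>(addr) }
}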
PS: Thanks to https://stackoverflow.com/a/46134764/2891595 (After I googled a lot more I found the trick with transmute!)
I want to do computations on a large set of data each frame of my web app. Only a subset of this will be used by JavaScript, so instead of sending the entire set of data back and forth between WebAssembly and JavaScript each frame, it would be nice if the data was maintained internally in my WebAssembly module.
In C, something like this works:
#include <emscripten/emscripten.h>
int state = 0;
void EMSCRIPTEN_KEEPALIVE inc() {
state++;
}
int EMSCRIPTEN_KEEPALIVE get() {
return state;
}
Is the same thing possible in Rust? I tried doing it with a static like this:
static mut state: i32 = 0;
pub fn main() {}
#[no_mangle]
pub fn add() {
state += 1;
}
#[no_mangle]
pub fn get() -> i32 {
state
}
But it seems static variables cannot be mutable.
Francis Gagné is absolutely correct that global variables generally make your code worse and you should avoid them.
However, for the specific case of WebAssembly as it is today, we don't have to worry about the "if you have multiple threads" concern.
We can thus choose to use mutable static variables, if we have a very good reason to do so:
// Only valid because we are using this in a WebAssembly
// context without threads.
static mut STATE: i32 = 0;

#[no_mangle]
pub extern fn add() {
    unsafe { STATE += 1 };
}

#[no_mangle]
pub extern fn get() -> i32 {
    unsafe { STATE }
}
We can see the behavior with this NodeJS driver program:
const fs = require('fs-extra');

fs.readFile(__dirname + '/target/wasm32-unknown-unknown/release/state.wasm')
  .then(bytes => WebAssembly.instantiate(bytes))
  .then(({ module, instance }) => {
    const { get, add } = instance.exports;
    console.log(get());
    add();
    add();
    console.log(get());
  });
The output is:
0
2
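As a side note (my addition, not from either answer): when the shared state is just one integer, an atomic gives the same two exports with no unsafe blocks at all, threads or no threads:

use std::sync::atomic::{AtomicI32, Ordering};

// No `static mut` and no `unsafe`: the atomic provides interior mutability.
static STATE: AtomicI32 = AtomicI32::new(0);

#[no_mangle]
pub extern fn add() {
    STATE.fetch_add(1, Ordering::Relaxed);
}

#[no_mangle]
pub extern fn get() -> i32 {
    STATE.load(Ordering::Relaxed)
}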
error[E0133]: use of mutable static requires unsafe function or block
In general, accessing mutable global variables is unsafe, which means that you can only do it in an unsafe block. With mutable global variables, it's easy to accidentally create dangling references (think of a reference to an item of a global mutable Vec), data races (if you have multiple threads – Rust doesn't care that you don't actually use threads) or otherwise invoke undefined behavior.
Global variables are usually not the best solution to a problem because they make your software less flexible and less reusable. Instead, consider passing the state explicitly (by reference, so you don't need to copy it) to the functions that need to operate on it. This lets the calling code work with multiple independent states.
Here's an example of allocating unique state and modifying that:
type State = i32;

#[no_mangle]
pub extern fn new() -> *mut State {
    Box::into_raw(Box::new(0))
}

#[no_mangle]
pub extern fn free(state: *mut State) {
    unsafe { Box::from_raw(state) };
}

#[no_mangle]
pub extern fn add(state: *mut State) {
    unsafe { *state += 1 };
}

#[no_mangle]
pub extern fn get(state: *mut State) -> i32 {
    unsafe { *state }
}
const fs = require('fs-extra');

fs.readFile(__dirname + '/target/wasm32-unknown-unknown/release/state.wasm')
  .then(bytes => WebAssembly.instantiate(bytes))
  .then(({ module, instance }) => {
    const { new: newFn, free, get, add } = instance.exports;

    const state1 = newFn();
    const state2 = newFn();

    add(state1);
    add(state2);
    add(state1);

    console.log(get(state1));
    console.log(get(state2));

    free(state1);
    free(state2);
  });
The output is:
2
1
Note — This currently needs to be compiled in release mode to work. Debugging mode has some issues at the moment.
Admittedly, this is not less unsafe because you're passing raw pointers around, but it makes it clearer in the calling code that there is some mutable state being manipulated. Also note that it is now the responsibility of the caller to ensure that the state pointer is being handled correctly.
I am trying to write a program that spawns a bunch of threads and then joins the threads at the end. I want it to be interruptible, because my plan is to make this a constantly running program in a UNIX service.
The idea is that worker_pool will contain all the threads that have been spawned, so terminate can be called at any time to collect them.
I can't seem to find a way to do this with the chan_select! macro, because it requires me to spawn a thread first that then spawns my child threads; once I do that, I can no longer use the worker_pool variable to join the threads on interrupt, because it had to be moved into the closure for the main loop. If you comment out the line in the interrupt handler that terminates the workers, it compiles.
I'm a little frustrated, because this would be really easy to do in C. I could set up a static pointer, but when I try to do that in Rust I get an error, because I am using a vector for my threads and I can't initialize a static to an empty vector. I know it is safe to join the workers in the interrupt code, because execution stops there waiting for the signal.
Perhaps there is a better way to do the signal handling, or maybe I'm missing something that I can do.
The error and code follow:
MacBook8088:video_ingest pjohnson$ cargo run
Compiling video_ingest v0.1.0 (file:///Users/pjohnson/projects/video_ingest)
error[E0382]: use of moved value: `worker_pool`
--> src/main.rs:30:13
|
24 | thread::spawn(move || run(sdone, &mut worker_pool));
| ------- value moved (into closure) here
...
30 | worker_pool.terminate();
| ^^^^^^^^^^^ value used here after move
<chan macros>:42:47: 43:23 note: in this expansion of chan_select! (defined in <chan macros>)
src/main.rs:27:5: 35:6 note: in this expansion of chan_select! (defined in <chan macros>)
|
= note: move occurs because `worker_pool` has type `video_ingest::WorkerPool`, which does not implement the `Copy` trait
main.rs
#[macro_use]
extern crate chan;
extern crate chan_signal;
extern crate video_ingest;
use chan_signal::Signal;
use video_ingest::WorkerPool;
use std::thread;
use std::ptr;
///
/// Starts processing
///
fn main() {
let mut worker_pool = WorkerPool { join_handles: vec![] };
// Signal gets a value when the OS sent a INT or TERM signal.
let signal = chan_signal::notify(&[Signal::INT, Signal::TERM]);
// When our work is complete, send a sentinel value on `sdone`.
let (sdone, rdone) = chan::sync(0);
// Run work.
thread::spawn(move || run(sdone, &mut worker_pool));
// Wait for a signal or for work to be done.
chan_select! {
signal.recv() -> signal => {
println!("received signal: {:?}", signal);
worker_pool.terminate(); // <-- Comment out to compile
},
rdone.recv() => {
println!("Program completed normally.");
}
}
}
fn run(sdone: chan::Sender<()>, worker_pool: &mut WorkerPool) {
loop {
worker_pool.ingest();
worker_pool.terminate();
}
}
lib.rs
extern crate libc;
use std::thread;
use std::thread::JoinHandle;
use std::os::unix::thread::JoinHandleExt;
use libc::pthread_join;
use libc::c_void;
use std::ptr;
use std::time::Duration;
pub struct WorkerPool {
pub join_handles: Vec<JoinHandle<()>>
}
impl WorkerPool {
///
/// Does the actual ingestion
///
pub fn ingest(&mut self) {
// Use 9 threads for an example.
for i in 0..10 {
self.join_handles.push(
thread::spawn(move || {
// Get the videos
println!("Getting videos for thread {}", i);
thread::sleep(Duration::new(5, 0));
})
);
}
}
///
/// Joins all threads
///
pub fn terminate(&mut self) {
println!("Total handles: {}", self.join_handles.len());
for handle in &self.join_handles {
println!("Joining thread...");
unsafe {
let mut state_ptr: *mut *mut c_void = 0 as *mut *mut c_void;
pthread_join(handle.as_pthread_t(), state_ptr);
}
}
self.join_handles = vec![];
}
}
terminate can be called at any time to collect them.
I don't want to stop the threads; I want to collect them with join. I agree stopping them would not be a good idea.
These two statements don't make sense to me. You can only join a thread when it's complete. The words "interruptible" and "at any time" suggest that you want to be able to stop a thread while it is still doing some processing. Which behavior do you want?
If you want to be able to stop a thread that has partially completed, you have to enhance your code to check if it should exit early. This is usually complicated by the fact that you are doing some big computation that you don't have control over. Ideally, you break that up into chunks and check your exit flag frequently. For example, with video work, you could check every frame. Then the response delay is roughly the time to process a frame.
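As a rough illustration of that per-chunk check (Frame and process_frame are stand-in names, not from your code):

use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

struct Frame;

fn process_frame(_frame: &Frame) {
    // ... expensive per-frame work ...
}

fn process_video(stop: &Arc<AtomicBool>, frames: &[Frame]) {
    for frame in frames {
        // Check between frames, so an interrupt is noticed within roughly
        // one frame's worth of processing time.
        if stop.load(Ordering::SeqCst) {
            return;
        }
        process_frame(frame);
    }
}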
this would be really easy to do in C.
This would be really easy to do incorrectly. For example, the code currently presented attempts to perform mutation to the pool from two different threads without any kind of synchronization. That's a sure-fire recipe to make broken, hard-to-debug code.
// Use 9 threads for an example.
0..10 creates 10 threads.
Anyway, it seems like the missing piece of knowledge is Arc and Mutex. Arc allows sharing ownership of a single item between threads, and Mutex allows for run-time mutable borrowing between threads.
#[macro_use]
extern crate chan;
extern crate chan_signal;

use chan_signal::Signal;
use std::thread::{self, JoinHandle};
use std::sync::{Arc, Mutex};

fn main() {
    let worker_pool = Arc::new(Mutex::new(WorkerPool::new()));

    let signal = chan_signal::notify(&[Signal::INT, Signal::TERM]);
    let (work_done_tx, work_done_rx) = chan::sync(0);

    let worker_pool_clone = worker_pool.clone();
    thread::spawn(move || run(work_done_tx, worker_pool_clone));

    // Wait for a signal or for work to be done.
    chan_select! {
        signal.recv() -> signal => {
            println!("received signal: {:?}", signal);
            let mut pool = worker_pool.lock().expect("Unable to lock the pool");
            pool.terminate();
        },
        work_done_rx.recv() => {
            println!("Program completed normally.");
        }
    }
}

fn run(_work_done_tx: chan::Sender<()>, worker_pool: Arc<Mutex<WorkerPool>>) {
    loop {
        let mut worker_pool = worker_pool.lock().expect("Unable to lock the pool");
        worker_pool.ingest();
        worker_pool.terminate();
    }
}

pub struct WorkerPool {
    join_handles: Vec<JoinHandle<()>>,
}

impl WorkerPool {
    pub fn new() -> Self {
        WorkerPool {
            join_handles: vec![],
        }
    }

    pub fn ingest(&mut self) {
        self.join_handles.extend(
            (0..10).map(|i| {
                thread::spawn(move || {
                    println!("Getting videos for thread {}", i);
                })
            })
        )
    }

    pub fn terminate(&mut self) {
        for handle in self.join_handles.drain(..) {
            handle.join().expect("Unable to join thread")
        }
    }
}
Beware that the program logic itself is still poor; even though an interrupt is sent, the loop in run continues to execute. The main thread will lock the mutex, join all the current threads [1], unlock the mutex and exit the program. However, the loop can lock the mutex before the main thread has exited and start processing some new data! And then the program exits right in the middle of processing. It's almost the same as if you didn't handle the interrupt at all.
[1]: Haha, tricked you! There are no running threads at that point. Since the mutex is locked for the entire loop, the only time another lock can be taken is when the loop is resetting. However, since the last instruction in the loop is to join all the threads, there won't be any more running.
I don't want to let the program terminate before all threads have completed.
Perhaps it's an artifact of the reduced problem, but I don't see how the infinite loop can ever exit, so the "I'm done" channel seems superfluous.
I'd probably just add a flag that says "please stop" when an interrupt is received. Then I'd check that instead of the infinite loop and wait for the running thread to finish before exiting the program.
use std::sync::atomic::{AtomicBool, Ordering};

fn main() {
    let worker_pool = WorkerPool::new();

    let signal = chan_signal::notify(&[Signal::INT, Signal::TERM]);

    let please_stop = Arc::new(AtomicBool::new(false));
    let threads_please_stop = please_stop.clone();

    let runner = thread::spawn(|| run(threads_please_stop, worker_pool));

    // Wait for a signal
    chan_select! {
        signal.recv() -> signal => {
            println!("received signal: {:?}", signal);
            please_stop.store(true, Ordering::SeqCst);
        },
    }

    runner.join().expect("Unable to join runner thread");
}

fn run(please_stop: Arc<AtomicBool>, mut worker_pool: WorkerPool) {
    while !please_stop.load(Ordering::SeqCst) {
        worker_pool.ingest();
        worker_pool.terminate();
    }
}
I have a struct:
struct MyData {
x: i32
}
I want to asynchronously start a long operation on this struct.
My first attempt was this:
fn foo(&self) { //should return immediately
std::thread::Thread::spawn(move || {
println!("{:?}",self.x); //consider a very long operation
});
}
Clearly the compiler cannot infer an appropriate lifetime due to conflicting requirements because self may be on the stack frame and thus cannot be guaranteed to exist by the time the operation is running on a different stack frame.
To solve this, I attempted to make a copy of self and provide that copy to the new thread:
fn foo(&self) { //should return immediately
let clone = self.clone();
std::thread::Thread::spawn(move || {
println!("{:?}",clone.x); //consider a very long operation
});
}
I think that does not compile because clone is now on the stack frame, which is similar to before. I also tried doing the clone inside the thread, and that does not compile either, I think for similar reasons.
Then I decided maybe I could use a channel to push the copied data into the thread, on the theory that perhaps channel can magically move (copy?) stack-allocated data between threads, which is suggested by this example in the documentation. However the compiler cannot infer a lifetime for this either:
fn foo(&self) { //should return immediately
let (tx, rx) = std::sync::mpsc::channel();
tx.send(self.clone());
std::thread::Thread::spawn(move || {
println!("{:?}",rx.recv().unwrap().x); //consider a very long operation
});
}
Finally, I decided to just copy my struct onto the heap explicitly, and pass an Arc into the thread. But not even here can the compiler figure out a lifetime:
fn foo(&self) { //should return immediately
let arc = std::sync::Arc::new(self.clone());
std::thread::Thread::spawn(move || {
println!("{:?}",arc.clone().x); //consider a very long operation
});
}
Okay borrow checker, I give up. How do I get a copy of self onto my new thread?
I think your issue is simply that your structure does not derive the Clone trait. You can get your second example to compile and run by adding #[derive(Clone)] before your struct's definition.
What I don't understand about the compiler's behaviour here is which .clone() function it tried to use: your structure did not implement the Clone trait, so it should not have had a .clone() method by default.
You may also want to consider having your function take self by value, and letting your caller decide whether to make a clone or just a move.
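To make that concrete, here is a sketch of the derive-based fix on a current toolchain (std::thread::spawn replaces the pre-1.0 Thread::spawn from the question, and the sleep in main is only there so the detached thread gets a chance to print):

use std::thread;
use std::time::Duration;

#[derive(Clone, Debug)]
struct MyData {
    x: i32,
}

impl MyData {
    // Returns immediately; the clone gives the spawned thread its own copy,
    // independent of the lifetime of &self.
    fn foo(&self) {
        let copy = self.clone();
        thread::spawn(move || {
            println!("{:?}", copy.x); // consider a very long operation
        });
    }
}

fn main() {
    let d = MyData { x: 42 };
    d.foo();
    thread::sleep(Duration::from_millis(100));
}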
As an alternative solution, you could use thread::scoped and maintain a handle to the thread. This allows the thread to hold a reference, without the need to copy it in:
#![feature(old_io,std_misc)]

use std::thread::{self, JoinGuard};
use std::old_io::timer;
use std::time::duration::Duration;

struct MyData {
    x: i32,
}

impl MyData {
    // returns immediately
    fn foo(&self) -> JoinGuard<()> {
        thread::scoped(move || {
            timer::sleep(Duration::milliseconds(300));
            println!("{:?}", self.x); // consider a very long operation
            timer::sleep(Duration::milliseconds(300));
        })
    }
}

fn main() {
    let d = MyData { x: 42 };
    let _thread = d.foo();
    println!("I'm so fast!");
}
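A closing note: thread::scoped in this form never shipped in stable Rust, but the same borrow-a-reference-into-the-thread pattern is available today as std::thread::scope (stable since Rust 1.63). A rough modern equivalent, with the caveat that scope joins its threads before returning, so foo no longer returns immediately:

use std::thread;

#[derive(Debug)]
struct MyData {
    x: i32,
}

impl MyData {
    fn foo(&self) {
        // The scope guarantees the spawned thread finishes before foo returns,
        // which is what lets it borrow `self` without a clone.
        thread::scope(|s| {
            s.spawn(|| {
                println!("{:?}", self.x); // consider a very long operation
            });
        });
    }
}

fn main() {
    let d = MyData { x: 42 };
    d.foo();
}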