How can I work around not being able to export functions with lifetimes when using wasm-bindgen? - rust

I'm trying to write a simple game that runs in the browser, and I'm having a hard time modeling a game loop given the combination of restrictions imposed by the browser, Rust, and wasm-bindgen.
A typical game loop in the browser follows this general pattern:
function mainLoop() {
    update();
    draw();
    requestAnimationFrame(mainLoop);
}
If I were to model this exact pattern in Rust/wasm-bindgen, it would look like this:
let main_loop = Closure::wrap(Box::new(move || {
    update();
    draw();
    window.request_animation_frame(main_loop.as_ref().unchecked_ref()); // Not legal
}) as Box<dyn FnMut()>);
Unlike JavaScript, I'm unable to reference main_loop from within itself, so this doesn't work.
An alternative approach that someone suggested is to follow the pattern illustrated in the game of life example. At a high level, it involves exporting a type that contains the game state and includes public tick() and render() functions that can be called from within a JavaScript game loop. This doesn't work for me because my game state requires lifetime parameters: it effectively just wraps a specs World and Dispatcher, and the latter has lifetime parameters. Ultimately, this means I can't export it using #[wasm_bindgen].
I'm having a hard time finding ways to work around these restrictions, and am looking for suggestions.

The easiest way to model this is likely to leave invocations of requestAnimationFrame to JS and instead just implement the update/draw logic in Rust.
In Rust, however, you can also exploit the fact that a closure which doesn't actually capture any variables is zero-sized, meaning that a Closure<T> wrapping it won't allocate memory and you can safely forget it. For example, something like this should work:
#[wasm_bindgen]
pub fn main_loop() {
    update();
    draw();
    let window = web_sys::window().unwrap();
    let closure = Closure::wrap(Box::new(|| main_loop()) as Box<dyn Fn()>);
    window.request_animation_frame(closure.as_ref().unchecked_ref()).unwrap();
    closure.forget(); // not actually leaking memory, since the closure is zero-sized
}
If your state has lifetimes inside of it, that is unfortunately incompatible with returning back to JS: when you return all the way back to the JS event loop, all WebAssembly stack frames have been popped, meaning that any lifetime is invalidated. This means that any game state persisted across iterations of main_loop will need to be 'static.
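If you do want the loop to re-schedule itself entirely from Rust, a common workaround is the pattern from the wasm-bindgen guide: store the Closure in an Rc<RefCell<Option<...>>> so the closure can reach itself through the shared slot. A minimal sketch, assuming the update() and draw() functions from the question and the web-sys window/requestAnimationFrame APIs:

use std::cell::RefCell;
use std::rc::Rc;
use wasm_bindgen::prelude::*;
use wasm_bindgen::JsCast;

fn request_animation_frame(f: &Closure<dyn FnMut()>) {
    web_sys::window()
        .expect("no global window")
        .request_animation_frame(f.as_ref().unchecked_ref())
        .expect("requestAnimationFrame failed");
}

#[wasm_bindgen(start)]
pub fn run() {
    // Two handles to one slot: the closure captures `f` so it can re-schedule
    // itself; `g` is used to fire the first frame from outside.
    let f: Rc<RefCell<Option<Closure<dyn FnMut()>>>> = Rc::new(RefCell::new(None));
    let g = f.clone();
    *g.borrow_mut() = Some(Closure::wrap(Box::new(move || {
        update();
        draw();
        // Schedule another frame by reading the closure back out of the slot.
        request_animation_frame(f.borrow().as_ref().unwrap());
    }) as Box<dyn FnMut()>));
    request_animation_frame(g.borrow().as_ref().unwrap());
}

Because the closure lives in the Rc slot rather than being forgotten, the state it captures must still be 'static, which is consistent with the point above.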

I'm a Rust novice, but here's how I addressed the same issue.
You can eliminate the problematic window.request_animation_frame recursion and implement an FPS cap at the same time by invoking window.request_animation_frame from a window.set_interval callback which checks an Rc<RefCell<bool>> or similar to see if there's an animation frame request still pending. I'm not sure if the inactive-tab behavior will be any different in practice.
I put the bool into my application state since I'm using an Rc<RefCell<...>> for that anyway for other event handling. I haven't checked that the code below compiles as-is, but here are the relevant parts of how I'm doing this:
pub struct MyGame {
    ...
    should_request_render: bool, // Don't request another render until the previous one runs; init to false since we'll fire the first one immediately.
}
...
let window = web_sys::window().expect("should have a window in this context");
let application_reference = Rc::new(RefCell::new(MyGame::new()));
let request_animation_frame = {
    // request_animation_frame is not forgotten! Its ownership is moved into the timer callback.
    let application_reference = application_reference.clone();
    let request_animation_frame_callback = Closure::wrap(Box::new(move || {
        let mut application = application_reference.borrow_mut();
        application.should_request_render = true;
        application.handle_animation_frame(); // handle_animation_frame being your main loop.
    }) as Box<dyn FnMut()>);
    let window = window.clone();
    move || {
        window
            .request_animation_frame(
                request_animation_frame_callback.as_ref().unchecked_ref(),
            )
            .unwrap();
    }
};
request_animation_frame(); // fire the first request immediately
let timer_closure = Closure::wrap(Box::new(move || {
    // move both request_animation_frame and application_reference here.
    let mut application = application_reference.borrow_mut();
    if application.should_request_render {
        application.should_request_render = false;
        request_animation_frame();
    }
}) as Box<dyn FnMut()>);
window.set_interval_with_callback_and_timeout_and_arguments_0(
    timer_closure.as_ref().unchecked_ref(),
    25, // minimum ms per frame
)?;
timer_closure.forget(); // this leaks it; you could instead store it somewhere, depending on whether it's guaranteed to live as long as the page
You can store the result of set_interval and the timer_closure in Options in your game state so that your game can clean itself up if needed (maybe? I haven't tried this, and it seems like it could cause a free of self). The circular reference won't erase itself unless broken, since you're then effectively storing Rcs to the application inside the application. It should also let you change the max FPS while running, by stopping the interval and creating another one using the same closure.
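A minimal sketch of that cleanup idea; the field names and the stop_timer method are hypothetical, and it assumes the interval handle returned by set_interval_with_callback_and_timeout_and_arguments_0 was saved:

use wasm_bindgen::closure::Closure;

pub struct MyGame {
    // Hypothetical fields for self-cleanup; names are illustrative.
    interval_id: Option<i32>,
    timer_closure: Option<Closure<dyn FnMut()>>,
}

impl MyGame {
    fn stop_timer(&mut self, window: &web_sys::Window) {
        if let Some(id) = self.interval_id.take() {
            window.clear_interval_with_handle(id); // cancels the interval
        }
        self.timer_closure = None; // dropped here instead of forgotten
    }
}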

Related

How to leverage concurrency for processing a large array of data in Rust?

I am a JavaScript developer and I am building a desktop application using Tauri. I am used to single-threaded languages and am trying to get comfortable with the concept of concurrency.
My problem can be summarized as follows: I receive a JSON array of length 500 from the backend. I loop through this array and perform some asynchronous operations, like making a network call. In the end, I aggregate, structure, and return the data. This entire process takes around 25-35 seconds on my machine.
I wanted to leverage concurrency to reduce the time required for this operation. One possible solution I thought of was to create n threads, say 8, and process the data in parallel.
async fn main() {
    // creating variables which will hold the final structured data
    let app_data;
    let app_to_bundle_map;
    // create 8 threads
    let thread_1 = thread::spawn(|| {});
    let thread_2 = thread::spawn(|| {});
    // .. and so on
    for i in 1..500 {
        let thread_assigned = i % 8;
        match thread_assigned {
            1 => {
                // process_data() on thread 1 and insert into app_data & app_to_bundle_map.
                // But how do I assign process_data() to the thread's closure?
                // How do I make sure thread 1 is available for use?
            }
            2 => {
                // process_data() on thread 2 and insert into app_data & app_to_bundle_map
            }
            _ => {
                // process_data() on thread _ and insert into app_data & app_to_bundle_map
            }
        }
    }
}
fn process_data(item: String, data: &Data) {
    // perform some heavy operations like
    make_network_call(item.url);
    // perform more operations and modify the function argument
    more_processing();
}
async fn make_a_network_call(url: String) -> String {
    let client = reqwest::Client::builder().build().unwrap();
    let res: Result<Response, Error> = client.get(url).send().await;
    match res {
        Ok(res) => {
            // return the response body
            res.text().await.unwrap()
        }
        Err(err) => {
            format!(r#"{{"error": "{placeholder}"}}"#, placeholder = err.to_string())
        }
    }
}
Another option I thought of was dividing my data of size 500 into 8 parts and then processing them in parallel. Is this a better approach? Or are both approaches wrong; if so, what do you suggest is the correct way to solve such problems in Rust? Overall, my final goal is to reduce the time from 25-35 seconds to less than 10 seconds. Looking forward to everybody's insights. Thank you in advance.
Concurrency is hard [citation needed]. What you are trying to do here is handle it manually. That could of course work, but it will be a pain, especially if you are a beginner. Luckily, there are fantastic libraries out there that handle the concurrency part for you. You should look into whether they already provide what you need. From your description I am not quite sure whether you are CPU-bound or IO-bound.
If you are CPU-bound you should look at the rayon crate. It lets you easily iterate over a collection in parallel.
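For example, a minimal sketch of rayon's parallel iterators, with a hypothetical process_item standing in for the real per-item work:

use rayon::prelude::*;

// Hypothetical stand-in for the real CPU-heavy per-item work.
fn process_item(x: &u64) -> u64 {
    x * 2
}

fn main() {
    let data: Vec<u64> = (0..500).collect();
    // par_iter() distributes the items across rayon's global thread pool.
    let processed: Vec<u64> = data.par_iter().map(process_item).collect();
    println!("processed {} items", processed.len());
}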
If you are IO-bound you should look at async Rust. There are many libraries that do many things, but I would recommend tokio to begin with. It is production-ready and puts great emphasis on networking. You would, however, need to learn a bit about async Rust, as it requires a different mental model than normal synchronous code.
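Since the original problem is 500 network calls, a sketch like this (assuming the tokio and futures crates as dependencies, with a placeholder in place of the real reqwest request) caps the number of in-flight requests at 8:

use futures::stream::{self, StreamExt};

// Placeholder for the real request; a reqwest call would go here.
async fn make_network_call(url: String) -> String {
    format!("response for {url}")
}

#[tokio::main]
async fn main() {
    let urls: Vec<String> = (0..500).map(|i| format!("https://example.com/{i}")).collect();
    // Run at most 8 requests concurrently; results arrive as they complete.
    let results: Vec<String> = stream::iter(urls)
        .map(make_network_call)
        .buffer_unordered(8)
        .collect()
        .await;
    println!("got {} responses", results.len());
}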
And regardless of which one you choose, you should familiarize yourself with channels. They are a great and easy tool for passing data around, including from one thread to another.
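A minimal sketch of a standard-library channel moving results from worker threads back to the main thread:

use std::sync::mpsc;
use std::thread;

fn main() {
    let (tx, rx) = mpsc::channel();
    for i in 0..8 {
        let tx = tx.clone();
        thread::spawn(move || {
            // Each worker sends its result back over the channel.
            tx.send(format!("result from worker {i}")).unwrap();
        });
    }
    drop(tx); // drop the original sender so the receiver loop can end
    for msg in rx {
        println!("{msg}");
    }
}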

Add heap-allocated string to Panic handler

I am in the context of a web application, where each request is assigned a unique correlation ID.
I am running in a wasm environment with the wasm32-unknown-unknown target. One request is always served by one thread, and the entire environment is torn down afterwards.
I would like to register a panic handler that if a request panics, it also logs this request ID.
This has proven to be difficult, as anything passed to the set_hook method needs to satisfy the 'static lifetime constraint, which a request ID obviously doesn't.
I would like code along the following lines to compile
// Assume we have a request here from somewhere.
let request = get_request_from_framework();
// This is at the start of the request
panic::set_hook(Box::new(|info| {
    let request_id = request.get_request_id();
    // Log panic messages with request_id here
}));
Potential solutions
I have a few potential approaches. I am not sure which one is best, or if there are any approaches that I am missing.
1. Leaking the memory
As I know my environment is torn down after each request, one way to get a String into the 'static lifetime is to leak it, like this:
let request_id = uuid::Uuid::new_v4().to_string();
let request_id: &'static str = Box::leak(request_id.into_boxed_str());
request_id
This will work in practice, as the request ID is theoretically 'static (after the request is served, the application is shut down). However, it has the disadvantage that if I ever move this code into a non-wasm environment, we'll end up leaking memory pretty quickly.
2. Threadlocal
As I know that each request is served by one thread, I could stuff the request id into a ThreadLocal, and read from that ThreadLocal on panics.
pub fn create_request_id() -> String {
    let request_id = uuid::Uuid::new_v4().to_string();
    CURRENT_REQUEST_ID.with(|current_request_id| {
        *current_request_id.borrow_mut() = request_id.clone();
    });
    request_id
}
thread_local! {
    pub static CURRENT_REQUEST_ID: RefCell<String> = RefCell::new(uuid::Uuid::new_v4().to_string());
}
// And then inside the panic handler, get the request_id with something like:
let request_id = CURRENT_REQUEST_ID.with(|current_request_id| {
    match current_request_id.try_borrow() {
        Ok(current_request_id) => current_request_id.clone(),
        Err(_) => "Unknown".to_string(),
    }
});
This seems like the "best" solution I can come up with. However, I'm not sure what the performance implications of initializing a ThreadLocal on each request are, particularly because we panic extremely rarely, so I'd hate to pay a big cost up front for something I almost never use.
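One way to avoid paying the UUID cost up front is to make the thread-local an Option that is only filled when a request ID is actually created; this is a sketch under the same thread-per-request assumption, with hypothetical helper names:

use std::cell::RefCell;

thread_local! {
    // Empty until a request ID is set, so idle threads pay almost nothing.
    static CURRENT_REQUEST_ID: RefCell<Option<String>> = RefCell::new(None);
}

pub fn set_request_id(id: String) {
    CURRENT_REQUEST_ID.with(|slot| *slot.borrow_mut() = Some(id));
}

pub fn current_request_id() -> String {
    CURRENT_REQUEST_ID.with(|slot| {
        // try_borrow avoids a double panic if the slot is somehow borrowed.
        slot.try_borrow()
            .ok()
            .and_then(|id| (*id).clone())
            .unwrap_or_else(|| "Unknown".to_string())
    })
}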
3. Catch_unwind
I experimented with the catch_unwind API, as that seemed like a good choice; I would then wrap the handling of each request with catch_unwind. However, it seems like wasm32-unknown-unknown currently doesn't respect catch_unwind.
What is the best solution here? Is there any way to get something that's heap-allocated into a Rust panic hook that I'm not aware of?
As per your example, you could move the id into the closure:
// Assume we have a request here from somewhere.
let request = get_request_from_framework();
let request_id = request.get_request_id();
// This is at the start of the request
panic::set_hook(Box::new(move |info| {
    let panic_message = format!("Request {} failed", request_id);
    // Log panic messages with request_id here
}));
Playground

Error calling a Lua function from Rust: `*mut rlua::ffi::lua_State` cannot be shared between threads safely

I am developing a CLI program for rendering template files using the new MiniJinja library by mitsuhiko.
The program is here: https://github.com/benwilber/temple.
I would like to be able to extend the program by allowing the user to load custom Lua scripts for things like custom filters, functions, and tests. However, I am running into Rust lifetime errors that I've not been able to solve.
Basically, I would like to be able to register a Lua function as a custom filter function. But it's showing an error when compiling. Here is the code:
https://github.com/benwilber/temple/compare/0.3.1..lua
Error:
https://gist.github.com/c649a0b240cf299d3dbbe018c24cbcdc
How can I call a Lua function from the MiniJinja add_filter function? I would prefer to try to do this in the regular/safe way. But I'm open to unsafe alternatives if required.
Thanks!
Edit: Posted the same on Reddit and users.rust-lang.org
Lua uses state that is not safe to use from more than one thread.
A consequence of this is that LuaFunction is neither Sync nor Send.
This is reflected in this part of the error message:
help: within `LuaFunction<'_>`, the trait `Sync` is not implemented for `*mut rlua::ffi::lua_State`
In contrast a minijinja::Filter must implement Send + Sync + 'static.
(See https://docs.rs/minijinja/0.5.0/minijinja/filters/trait.Filter.html)
This means we can't share LuaFunctions (or even LuaContext) between calls to the Filters.
One option is to not pass your Lua state into the closures, and instead create a new Lua state on every call, something like this:
env.add_filter(
    "concat2",
    |_env: &Environment, s1: String, s2: String|
        -> anyhow::Result<String, minijinja::Error> {
        // Create a fresh Lua state on every call so nothing non-Send/Sync is captured.
        let lua = rlua::Lua::new();
        lua.context(|lua_ctx| {
            lua_ctx.load(include_str!("temple.lua")).exec().unwrap();
            let globals = lua_ctx.globals();
            let temple: rlua::Table = globals.get("temple").unwrap();
            let filters: rlua::Table = temple.get("_filters").unwrap();
            let concat2: rlua::Function = filters.get("concat2").unwrap();
            let res: String = concat2.call::<_, String>((s1, s2)).unwrap();
            Ok(res)
        })
    },
);
This is likely to have relatively high overhead.
Another option is to create your rlua state on a dedicated thread and communicate with it via channels. That would look more like this:
use std::sync::mpsc::{channel, sync_channel, SyncSender};
use std::sync::Mutex;
use std::thread;

pub fn test() {
    let mut env = minijinja::Environment::new();
    let (to_lua_tx, to_lua_rx) = channel::<(String, String, SyncSender<String>)>();
    thread::spawn(move || {
        let lua = rlua::Lua::new();
        lua.context(move |lua_ctx| {
            lua_ctx.load("some_code").exec().unwrap();
            let globals = lua_ctx.globals();
            let temple: rlua::Table = globals.get("temple").unwrap();
            let filters: rlua::Table = temple.get("_filters").unwrap();
            let concat2: rlua::Function = filters.get("concat2").unwrap();
            while let Ok((s1, s2, channel)) = to_lua_rx.recv() {
                let res: String = concat2.call::<_, String>((s1, s2)).unwrap();
                channel.send(res).unwrap()
            }
        })
    });
    let to_lua_tx = Mutex::new(to_lua_tx);
    env.add_filter(
        "concat2",
        move |_env: &minijinja::Environment,
              s1: String,
              s2: String|
              -> anyhow::Result<String, minijinja::Error> {
            let (tx, rx) = sync_channel::<String>(0);
            to_lua_tx.lock().unwrap().send((s1, s2, tx)).unwrap();
            let res = rx.recv().unwrap();
            Ok(res)
        },
    );
}
It would even be possible to start multiple lua states this way, but would require a bit more plumbing.
DISCLAIMER: This code is all untested - however, it builds with a stubbed version of minijinja and rlua in the playground. You probably want better error handling and might need some additional code to handle cleanly shutting down all the threads.

Rust: Safe multithreading with recursion

I'm new to Rust.
For learning purposes, I'm writing a simple program to search for files in Linux, and it uses a recursive function:
fn ffinder(base_dir: String, prmtr: &'static str, e: bool, h: bool) -> std::io::Result<()> {
    let mut handle_vec = vec![];
    let pth = std::fs::read_dir(&base_dir)?;
    for p in pth {
        let p2 = p?.path();
        if p2.is_dir() {
            if !h { // search doesn't include hidden directories
                let sstring: String = get_fname(p2.display().to_string());
                let slice: String = sstring[..1].to_string();
                if slice != ".".to_string() {
                    let handle = thread::spawn(move || {
                        let _ = ffinder(p2.display().to_string(), prmtr, e, h);
                    });
                    handle_vec.push(handle);
                }
            } else { // search includes hidden directories
                let handle2 = thread::spawn(move || {
                    let _ = ffinder(p2.display().to_string(), prmtr, e, h);
                });
                handle_vec.push(handle2);
            }
        } else {
            let handle3 = thread::spawn(move || {
                if compare(rmv_underline(get_fname(p2.display().to_string())), rmv_underline(prmtr.to_string()), e) {
                    println!("File found at: {}", p2.display().to_string().blue());
                }
            });
            handle_vec.push(handle3);
        }
    }
    for h in handle_vec {
        h.join().unwrap();
    }
    Ok(())
}
I've tried to use multithreading (thread::spawn); however, it can create too many threads, exceeding the OS limit, which breaks the program's execution.
Is there a way to multithread with recursion, using a safe, limited (fixed) number of threads?
As one of the commenters mentioned, this is an absolutely perfect case for using Rayon. The blog post mentioned doesn't show how Rayon might be used in recursion, only making an allusion to crossbeam's scoped threads with a broken link. However, Rayon provides its own scoped threads implementation that solves your problem as well, in that it only uses as many threads as you have cores available, avoiding the error you ran into.
Here's the documentation for it:
https://docs.rs/rayon/1.0.1/rayon/fn.scope.html
Here's an example from some code I recently wrote. Basically what it does is recursively scan a folder, and each time it nests into a folder it creates a new job to scan that folder while the current thread continues. In my own tests it vastly outperforms a single threaded approach.
use std::fs::{self, DirEntry};
use std::path::{Path, PathBuf};
use std::sync::mpsc::{self, Sender};
use rayon::Scope;

let source = PathBuf::from("/foo/bar/");
let (tx, rx) = mpsc::channel();
rayon::scope(|s| scan(&source, tx, s));

fn scan<'a, U: AsRef<Path>>(
    src: &U,
    tx: Sender<(Result<DirEntry, std::io::Error>, u64)>,
    scope: &Scope<'a>,
) {
    let dir = fs::read_dir(src).unwrap();
    dir.into_iter().for_each(|entry| {
        let info = entry.as_ref().unwrap();
        let path = info.path();
        if path.is_dir() {
            let tx = tx.clone();
            scope.spawn(move |s| scan(&path, tx, s)) // Recursive call here
        } else {
            // dbg!("{}", path.as_os_str().to_string_lossy());
            let size = info.metadata().unwrap().len();
            tx.send((entry, size)).unwrap();
        }
    });
}
I'm not an expert on Rayon, but I'm fairly certain the threading strategy works like this:
Rayon creates a pool of threads to match the number of logical cores you have available in your environment. The first call to the scoped function creates a job that the first available thread "steals" from the queue of jobs available. Each time we make another recursive call, it doesn't necessarily execute immediately, but it creates a new job that an idle thread can then "steal" from the queue. If all of the threads are busy, the job queue just fills up each time we make another recursive call, and each time a thread finishes its current job it steals another job from the queue.
The full code can be found here: https://github.com/1Dragoon/fcp
(Note that the repo is a work in progress; the code there may well be broken and not work at the time you're reading this.)
As a caveat to the reader: I'm more of a sysadmin than an actual developer, so I don't know if this is the ideal approach. From Rayon's documentation linked earlier:
scope() is a more flexible building block compared to join(), since a loop can be used to spawn any number of tasks without recursing
The language there is a bit confusing. I'm not sure what they mean by "without recursing". join seems to expect that you already know the tasks ahead of time and executes them in parallel as threads become available, whereas scope seems more aimed at creating jobs only when you need them, rather than having to know everything you need to do in advance. Either that, or I'm understanding their meaning backwards.
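For comparison, a minimal sketch (my own illustration, not from the Rayon docs) of divide-and-conquer recursion with join, which runs exactly two closures, potentially in parallel:

// Recursively sum a slice, splitting the work across rayon's thread pool.
fn par_sum(slice: &[u64]) -> u64 {
    if slice.len() < 1024 {
        return slice.iter().sum(); // small inputs: plain sequential sum
    }
    let (left, right) = slice.split_at(slice.len() / 2);
    let (a, b) = rayon::join(|| par_sum(left), || par_sum(right));
    a + b
}

fn main() {
    let data: Vec<u64> = (0..1_000_000).collect();
    println!("{}", par_sum(&data));
}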

How do I create a structure that can be used in a Rust multithreaded server?

I want to implement a simple server, used by 3 different modules of my project.
These modules will send data to the server, which will save it into a file and merge the information when the modules finish their jobs.
Each piece of information has a timestamp (a float) and a label (a float or a string).
This is my data structure for saving this information:
pub struct Data {
    file_name: String,
    logs: Vec<(f32, String)>,
    measures: Vec<(f32, f32)>,
    statements: Vec<(f32, String)>,
}
I use sockets to interact with the server.
I also use Arc to make a Data struct shareable by each of these modules.
So, when I handle the client, I verify that the message sent by the module is correct, and if it is, I call a function that processes and saves the message in the right data structure field (logs, measures, or statements).
// Current ip address
let ip_addr: &str = &format!("{}:{}", &ip, port);
// Bind the current IP address
let listener = match TcpListener::bind(ip_addr) {
    Ok(listener) => listener,
    Err(error) => panic!("Cannot bind {}, due to error {}", ip_addr, error),
};
let global_data_struct = Data::new(DEFAULT_FILE.to_string());
let global_data_struct_shared = Arc::new(global_data_struct);
// Get and process streams
for stream in listener.incoming() {
    let mut global_data_struct_shared_clone = global_data_struct_shared.clone();
    thread::spawn(move || {
        // Borrow stream
        let stream = stream;
        match stream {
            // Get the stream value
            Ok(mut stream_v) => {
                let current_ip = stream_v.peer_addr().unwrap().ip();
                let current_port = stream_v.peer_addr().unwrap().port();
                println!("Connected with peer {}:{}", current_ip, current_port);
                // PROBLEM IN handle_client!
                // A get_mut from global_data_struct_shared_clone
                // returns None, not a value - so I
                // can't access global_data_struct_shared_clone's
                // fields :'(
                handle_client(&mut stream_v, &mut global_data_struct_shared_clone);
            },
            Err(_) => error!("Cannot decode stream"),
        }
    });
}
// Stop listening
drop(listener);
I have some problems getting a mutable reference in handle_client to process fields in global_data_struct_shared_clone, because Arc::get_mut(global_data_struct_shared_clone) returns None - due to the global_data_struct_shared.clone() for each incoming request.
Can someone help me manage this structure correctly between these 3 modules, please?
The insight of Rust is that memory safety is achieved by enforcing Aliasing XOR Mutability.
Enforcing this single principle prevents whole classes of bugs: pointer/iterator invalidation (which was the goal) and also data races.
As much as possible, Rust will try to enforce this principle at compile-time; however it can also enforce it at run-time if the user opts in by using dedicated types/methods.
Arc::get_mut is such a method. An Arc (Atomic Reference Counted pointer) is specifically meant to share a reference between multiple owners, which means aliasing, and as a result it disallows mutability by default; Arc::get_mut performs a run-time check: if the pointer is actually not aliased (a reference count of 1), then it allows mutability.
However, as you realized, this is not suitable in your case since the Arc is aliased at that point in time.
So you need to turn to other types.
The simplest solution is Arc<Mutex<...>>: Arc allows sharing, Mutex allows controlled mutability; together, you can share with run-time controlled mutability enforced by the Mutex.
This is coarse-grained, but might very well be sufficient.
More sophisticated approaches can use RwLock (a reader-writer lock), more granular Mutexes, or even atomics; but I would advise starting with a single Mutex and seeing how it goes. You have to walk before you run.
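A minimal self-contained sketch of that approach, with a plain Vec of (timestamp, label) pairs standing in for the Data struct above:

use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    let shared = Arc::new(Mutex::new(Vec::<(f32, String)>::new()));
    let mut handles = vec![];
    for i in 0..3 {
        let shared = Arc::clone(&shared);
        handles.push(thread::spawn(move || {
            // lock() blocks until the mutex is free and yields mutable access.
            shared.lock().unwrap().push((i as f32, format!("module {i}")));
        }));
    }
    for handle in handles {
        handle.join().unwrap();
    }
    println!("{:?}", shared.lock().unwrap());
}

Each thread clones the Arc and holds the lock only for the duration of its mutation; in the question's code, handle_client could take an &Arc<Mutex<Data>> and lock inside.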
