It's not my preference, but I'm forced to write some Rust today, so I'm trying to create a Rocket instance with only one endpoint. On that endpoint I need to access a variable that is created in main. The variable takes a long time to instantiate, which is why I do it there.
My problem is that I can't find a way to pass it safely. Whatever I do, the compiler complains about thread safety, even though the library appears to be thread safe: https://github.com/brave/adblock-rust/pull/130 (the committed code is present in my local copy)
This is the error that I get:
|
18 | / lazy_static! {
19 | | static ref rules_engine: Mutex<Vec<Engine>> = Mutex::new(vec![]);
20 | | }
| |_^ `std::rc::Rc<std::cell::RefCell<lifeguard::CappedCollection<std::vec::Vec<u64>>>>` cannot be sent between threads safely
|
...and this is my code:
#![feature(proc_macro_hygiene, decl_macro)]
#[macro_use]
extern crate rocket;
use std::fs::File;
use std::io::{self, BufRead};
use std::path::Path;
use lazy_static::lazy_static;
use std::sync::Mutex;
use adblock::engine::Engine;
use adblock::lists::FilterFormat;
use rocket::request::{Form, FormError, FormDataError};
lazy_static! {
static ref rules_engine: Mutex<Vec<Engine>> = Mutex::new(vec![]);
}
fn main() {
if !Path::new("./rules.txt").exists() {
println!("rules file does not exist")
} else {
println!("loading rules");
let mut rules = vec![];
if let Ok(lines) = read_lines("./rules.txt") {
for line in lines {
if let Ok(ip) = line {
rules.insert(0, ip)
}
}
let eng = Engine::from_rules(&rules, FilterFormat::Standard);
rules_engine.lock().unwrap().push(eng);
rocket().launch();
}
}
}
#[derive(Debug, FromForm)]
struct FormInput<> {
#[form(field = "textarea")]
text_area: String
}
#[post("/", data = "<sink>")]
fn sink(sink: Result<Form<FormInput>, FormError>) -> String {
match sink {
Ok(form) => {
format!("{:?}", &*form)
}
Err(FormDataError::Io(_)) => format!("Form input was invalid UTF-8."),
Err(FormDataError::Malformed(f)) | Err(FormDataError::Parse(_, f)) => {
format!("Invalid form input: {}", f)
}
}
}
fn rocket() -> rocket::Rocket {
rocket::ignite().mount("/", routes![sink])
}
fn read_lines<P>(filename: P) -> io::Result<io::Lines<io::BufReader<File>>>
where P: AsRef<Path>, {
let file = File::open(filename)?;
Ok(io::BufReader::new(file).lines())
}
Is there any way to have eng available inside the sink endpoint method?
Rc is not thread safe, even behind a mutex: Mutex<T> is only Send and Sync when T itself is Send. It looks like Rc is used in eng.blocker.pool.pool, which is a lifeguard::Pool. So no, the Engine is not thread safe (at least by default).
Fortunately, it appears that the adblock crate has a feature called "object-pooling", which is what enables that pooling. Disabling that feature will (hopefully) make the Engine thread safe.
Rocket makes it really easy to share resources between routes (and also between main or any other thread you might have spawned from main). They call their mechanism state. Check out its documentation here.
To give a short example of how it works:
You create your type that you want to share in your application and manage an instance of that type in the instance of rocket that you use for your application. In the guide they give this example:
use std::sync::atomic::AtomicUsize;
struct HitCount {
count: AtomicUsize
}
rocket::build().manage(HitCount { count: AtomicUsize::new(0) });
In a route then you access the resource like this (again from the guide):
use std::sync::atomic::Ordering;
use rocket::State;
#[get("/count")]
fn count(hit_count: &State<HitCount>) -> String {
let current_count = hit_count.count.load(Ordering::Relaxed);
format!("Number of visits: {}", current_count)
}
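While I was learning Rocket, I needed to share a struct that contained a String and change it from a route. Managed state must be Send + Sync and is only handed out by shared reference, so anything you want to mutate has to be wrapped in a Mutex (or a similar type) before you manage it with Rocket.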
Also, as far as I understand, only one resource of any specific type can be shared with manage. But you can just create differently named wrapper types in that case and work around that limitation.
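Stripped of Rocket, the shape of managed state is roughly this: build the expensive value once in main, wrap whatever must be mutated in a Mutex, and let the handler threads see it only through a shared reference. Below is a minimal std-only sketch of that idea. `AppState`, `handle_request` and the thread loop are stand-ins invented for illustration; they are not Rocket APIs.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical shared state. A String is Send + Sync on its own, but changing
// it through a shared reference needs interior mutability, hence the Mutex.
struct AppState {
    last_input: Mutex<String>,
}

// Stand-in for a route handler: it only ever receives `&AppState`.
fn handle_request(state: &AppState, body: &str) -> usize {
    let mut last = state.last_input.lock().unwrap();
    last.clear();
    last.push_str(body);
    last.len()
}

fn main() {
    // Built once, up front, like a value handed to `manage`.
    let state = Arc::new(AppState {
        last_input: Mutex::new(String::new()),
    });

    // Several "request" threads share the same state.
    let workers: Vec<_> = (0..4)
        .map(|i| {
            let state = Arc::clone(&state);
            thread::spawn(move || handle_request(&state, &format!("request {}", i)))
        })
        .collect();

    for worker in workers {
        println!("stored {} bytes", worker.join().unwrap());
    }
}
```

If a second value of the same underlying type were needed, a second wrapper struct with a different name would keep the two apart.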
Related
I am using tikv/raft-rs library for implementing a consensus system. This library has a RawNode object which is a thread-unsafe object. We must execute some functions on this object periodically (example), hence I use a new thread for executing.
Here are the constraints:
I need access to this object on the main thread in order to read some of its internal state (e.g. raw_node.is_leader).
This object must be accessed on a worker thread.
Because of these constraints, this seems impossible because:
If I create this object and move it to the new thread, the main thread cannot read its state.
If I keep this object on the main thread, I cannot use Arc<Mutex<>> because this object doesn't implement the Copy trait.
Here is my code:
use std::thread::JoinHandle;
use std::sync::{Arc, Mutex, mpsc};
use std::collections::{VecDeque, HashMap};
use raft::RawNode;
use std::sync::mpsc::{Receiver, Sender, TryRecvError};
use protobuf::Message as PbMessage;
use raft::eraftpb::ConfState;
use raft::storage::MemStorage;
use raft::{prelude::*, StateRole};
use regex::Regex;
use crate::proposal::Proposal;
use crate::{proposal, batch};
use std::{str, thread};
use std::time::{Instant, Duration};
use crate::batch::Mailbox;
pub struct Node {
core: Arc<Mutex<CoreNode>>
}
#[derive(Copy)]
struct CoreNode {
raft_group: RawNode<MemStorage>,
}
impl Node {
pub fn new() -> Self {
let cfg = Config {
election_tick: 10,
heartbeat_tick: 3,
..Default::default()
};
let storage = MemStorage::new();
let core = Arc::new(Mutex::new(CoreNode {
raft_group: RawNode::new(&cfg, storage).unwrap()
}));
thread::spawn(move || {
core.lock().unwrap().run();
return;
});
Node { core: core.clone() }
}
pub fn is_leader(&self) -> bool {
return self.core.lock().unwrap().raft_group.raft.state == StateRole::Leader;
}
}
impl CoreNode {
pub fn run(mut self) {}
}
When compiling, here is the error:
22 | #[derive(Copy)]
| ^^^^
23 | struct CoreNode {
24 | raft_group: RawNode<MemStorage>,
| ------------------------------- this field does not implement `Copy`
|
My question is: how can I design around this problem?
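One possible way around both constraints, sketched here under assumptions, is to let the worker thread own the object outright and have it publish only the bits of state the main thread needs. In the sketch below `FakeRaftNode` is a made-up stand-in for RawNode (so the example has no dependencies), and `Command`, `tick` and `stop` are hypothetical names. The node is created inside the worker closure, so it never has to cross a thread boundary, and no Copy or Clone is needed.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{self, Sender};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

// Stand-in for a type like RawNode that should stay on one thread.
struct FakeRaftNode {
    ticks: u32,
}

impl FakeRaftNode {
    fn tick(&mut self) {
        self.ticks += 1;
    }
    // Pretend the node wins an election after three ticks.
    fn is_leader(&self) -> bool {
        self.ticks >= 3
    }
}

enum Command {
    Tick,
    Stop,
}

pub struct Node {
    is_leader: Arc<AtomicBool>,
    commands: Sender<Command>,
    worker: Option<JoinHandle<()>>,
}

impl Node {
    pub fn new() -> Self {
        let is_leader = Arc::new(AtomicBool::new(false));
        let published = Arc::clone(&is_leader);
        let (commands, inbox) = mpsc::channel();
        let worker = thread::spawn(move || {
            // Created on the worker thread and never moved off it.
            let mut raft = FakeRaftNode { ticks: 0 };
            for command in inbox {
                match command {
                    Command::Tick => raft.tick(),
                    Command::Stop => break,
                }
                // Publish the state the main thread is interested in.
                published.store(raft.is_leader(), Ordering::SeqCst);
            }
        });
        Node { is_leader, commands, worker: Some(worker) }
    }

    pub fn tick(&self) {
        self.commands.send(Command::Tick).unwrap();
    }

    pub fn is_leader(&self) -> bool {
        self.is_leader.load(Ordering::SeqCst)
    }

    // Asks the worker to finish and waits for it.
    pub fn stop(&mut self) {
        let _ = self.commands.send(Command::Stop);
        if let Some(worker) = self.worker.take() {
            worker.join().unwrap();
        }
    }
}

fn main() {
    let mut node = Node::new();
    for _ in 0..3 {
        node.tick();
    }
    node.stop();
    println!("leader: {}", node.is_leader());
}
```

The trade-off is that the main thread sees a snapshot that may lag slightly behind the worker; richer queries would need a request/response channel instead of a single flag.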
After reading the Rust book, I've decided to give it a try with WebAssembly. I'm creating a simple tracker script to practice and learn more about it. There are a couple of methods that need to access the window, navigator or cookie API. Every time I have to access any of those, there is a lot of boilerplate code involved:
pub fn start() {
let window = web_sys::window().unwrap();
let document = window.document().unwrap();
let html = document.dyn_into::<web_sys::HtmlDocument>().unwrap();
let cookie = html.cookie().unwrap();
}
That's impractical and bothers me. Is there a smart way to solve this? I've in fact tried to use lazy_static to have all of this in a global.rs file:
#[macro_use]
extern crate lazy_static;
use web_sys::*;
lazy_static! {
static ref WINDOW: Window = {
web_sys::window().unwrap()
};
}
But the compilation fails with: `*mut u8` cannot be shared between threads safely.
You could use the ? operator instead of unwrapping.
Instead of writing
pub fn start() {
let window = web_sys::window().unwrap();
let document = window.document().unwrap();
let html = document.dyn_into::<web_sys::HtmlDocument>().unwrap();
let cookie = html.cookie().unwrap();
}
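You can write
use wasm_bindgen::JsCast; // brings dyn_into into scope

pub fn start() -> Option<()> {
    // window() and document() return Option; dyn_into() and cookie() return
    // Result, so .ok() converts them before applying `?`.
    let cookie = web_sys::window()?
        .document()?
        .dyn_into::<web_sys::HtmlDocument>().ok()?
        .cookie().ok()?;
    Some(())
}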
It's the same number of lines, but less boilerplate and for simpler cases a one-liner.
If you really don't want to return a result, you can wrap the whole thing in a closure that you call immediately (or use a try block if you're happy using unstable features).
pub fn start() {
    let cookie = (|| -> Option<String> {
        web_sys::window()?
            .document()?
            .dyn_into::<web_sys::HtmlDocument>().ok()?
            .cookie().ok()
    })().unwrap();
}
If you find you don't like repeating this frequently, you can use functions:
fn document() -> Option<web_sys::Document> {
    web_sys::window()?.document()
}
fn html() -> Option<web_sys::HtmlDocument> {
    document()?.dyn_into::<web_sys::HtmlDocument>().ok()
}
fn cookie() -> Option<String> {
    html()?.cookie().ok()
}
pub fn start() -> Option<()> {
    let cookie = cookie()?;
    Some(())
}
"That's impractical and bothers me."
I'm not sure what your issue is here, but if you access the same cookie again and again in your application, perhaps you can save it in a struct and just use that struct? In my recent WebAssembly project I saved some of the elements I used in a struct and passed that struct around.
I also think that perhaps explaining your specific use case might lead to more specific answers :)
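On the lazy_static attempt from the question: a lazy_static value has to be Sync, and the error message shows that the window handle is not. A thread_local! static has no such requirement, so it may be a closer fit for a per-thread handle. The sketch below uses a made-up `FakeWindow` (built on Rc, so it is neither Send nor Sync) in place of the real web_sys type so that it runs anywhere; `cookie` and `set_cookie` are hypothetical helpers.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Stand-in for a handle like the browser window: Rc makes it !Send and !Sync.
struct FakeWindow {
    cookie: Rc<RefCell<String>>,
}

thread_local! {
    // Lazily created once per thread; no Sync bound is required.
    static WINDOW: FakeWindow = FakeWindow {
        cookie: Rc::new(RefCell::new(String::from("session=abc"))),
    };
}

// Hypothetical helper: read the cookie through the thread-local handle.
fn cookie() -> String {
    WINDOW.with(|window| window.cookie.borrow().clone())
}

// Hypothetical helper: overwrite the cookie.
fn set_cookie(value: &str) {
    WINDOW.with(|window| *window.cookie.borrow_mut() = value.to_string());
}

fn main() {
    println!("before: {}", cookie());
    set_cookie("session=xyz");
    println!("after: {}", cookie());
}
```

Whether caching the handle is worth it at all depends on the application; small helper functions that look the window up each time may be enough.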
I have an object that I know that is inside an Arc because all the instances are always Arced. I would like to be able to pass a cloned Arc of myself in a function call. The thing I am calling will call me back later on other threads.
In C++, there is a standard mixin called enable_shared_from_this. It enables me to do exactly this
class Bus : public std::enable_shared_from_this<Bus>
{
....
void SetupDevice(Device device,...)
{
device->Attach(shared_from_this());
}
}
If this object is not under shared_ptr management (the closest C++ has to Arc) then this will fail at run time.
I cannot find an equivalent.
EDIT:
Here is an example of why it's needed. I have a timerqueue library. It allows a client to request an arbitrary closure to be run at some point in the future. The code is run on a dedicated thread. To use it, you must pass a closure containing the code you want to be executed later.
use std::time::{Duration, Instant};
use timerqueue::*;
use parking_lot::Mutex;
use std::sync::{Arc,Weak};
use std::ops::{DerefMut};
// inline me keeper cos not on github
pub struct MeKeeper<T> {
them: Mutex<Weak<T>>,
}
impl<T> MeKeeper<T> {
pub fn new() -> Self {
Self {
them: Mutex::new(Weak::new()),
}
}
pub fn save(&self, arc: &Arc<T>) {
*self.them.lock().deref_mut() = Arc::downgrade(arc);
}
pub fn get(&self) -> Arc<T> {
match self.them.lock().upgrade() {
Some(arc) => return arc,
None => unreachable!(),
}
}
}
// -----------------------------------
struct Test {
data:String,
me: MeKeeper<Self>,
}
impl Test {
pub fn new() -> Arc<Test>{
let arc = Arc::new(Self {
me: MeKeeper::new(),
data: "Yo".to_string()
});
arc.me.save(&arc);
arc
}
fn task(&self) {
println!("{}", self.data);
}
// in real use case the TQ and a ton of other status data is passed in the new call for Test
// to keep things simple here the 'container' passes tq as an arg
pub fn do_stuff(&self, tq: &TimerQueue) {
// stuff includes a async task that must be done in 1 second
//.....
let me = self.me.get().clone();
tq.queue(
Box::new(move || me.task()),
"x".to_string(),
Instant::now() + Duration::from_millis(1000),
);
}
}
fn main() {
// in the real case (PDP11 emulator) there is a Bus class owning tons of objects that
// stays alive for the whole duration
let tq = Arc::new(TimerQueue::new());
let test = Test::new();
test.do_stuff(&*tq);
// just to keep everything alive while we wait
let mut input = String::new();
std::io::stdin().read_line(&mut input).unwrap();
}
Cargo.toml
[package]
name = "tqclient"
version = "0.1.0"
edition = "2018"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
timerqueue = { git = "https://github.com/pm100/timerqueue.git" }
parking_lot = "0.11"
There is no way to go from a &self to the Arc that self is stored in. This is because:
Rust references have additional assumptions compared to C++ references that would make such a conversion undefined behavior.
Rust's implementation of Arc does not even expose the information necessary to determine whether self is stored in an Arc or not.
Luckily, there is an alternative approach. Instead of creating a &self to the value inside the Arc, and passing that to the method, pass the Arc directly to the method that needs to access it. You can do that like this:
use std::sync::Arc;
struct Shared {
field: String,
}
impl Shared {
fn print_field(self: Arc<Self>) {
let clone: Arc<Shared> = self.clone();
println!("{}", clone.field);
}
}
Then the print_field function can only be called on a Shared that is wrapped in an Arc.
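A small variation that may be handy: taking `self: &Arc<Self>` instead of `self: Arc<Self>` lets the caller keep its own handle, and the method clones the Arc only when it actually needs to hand one to another thread. The sketch below reuses the Shared struct from above; `spawn_reader` is a hypothetical method name.

```rust
use std::sync::Arc;
use std::thread;

struct Shared {
    field: String,
}

impl Shared {
    // Callable only on a Shared inside an Arc; the caller's Arc is borrowed,
    // not consumed.
    fn spawn_reader(self: &Arc<Self>) -> thread::JoinHandle<usize> {
        let me = Arc::clone(self);
        // The clone moves into the thread and keeps the value alive there.
        thread::spawn(move || me.field.len())
    }
}

fn main() {
    let shared = Arc::new(Shared { field: "hello".to_string() });
    let handle = shared.spawn_reader();
    println!("length seen by the worker: {}", handle.join().unwrap());
    // The original Arc is still usable here.
    println!("field: {}", shared.field);
}
```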
Having found that I needed this three times in recent days, I decided to stop trying to come up with other designs. Maybe it is poor data design as far as Rust is concerned, but I needed it.
It works by changing the new function of the types using it to return an Arc rather than a raw Self. All my objects are in Arcs anyway; before, the caller wrapped them, and now it's enforced.
A mini utility library called mekeeper:
use parking_lot::Mutex;
use std::sync::{Arc,Weak};
use std::ops::{DerefMut};
pub struct MeKeeper<T> {
them: Mutex<Weak<T>>,
}
impl<T> MeKeeper<T> {
pub fn new() -> Self {
Self {
them: Mutex::new(Weak::new()),
}
}
pub fn save(&self, arc: &Arc<T>) {
*self.them.lock().deref_mut() = Arc::downgrade(arc);
}
pub fn get(&self) -> Arc<T> {
match self.them.lock().upgrade() {
Some(arc) => return arc,
None => unreachable!(),
}
}
}
To use it:
pub struct Test {
me: MeKeeper<Self>,
foo:i8,
}
impl Test {
pub fn new() -> Arc<Self> {
let arc = Arc::new(Test {
me: MeKeeper::new(),
foo:42
});
arc.me.save(&arc);
arc
}
}
Now when an instance of Test wants to call a function that requires it to pass in an Arc, it does:
fn nargle(&self) {
    let me = self.me.get();
    Ooddle::fertang(me, 42); // fertang needs an Arc<T>
}
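Using a Weak is what shared_from_this does as well: it avoids a reference cycle in which the object would keep itself alive forever. I stole that idea.
The unreachable path is safe because the only code that can call MeKeeper::get is the instance of T (Test here) that owns it, and that call can only happen while the T instance is alive. Hence Weak::upgrade never returns None (the one exception would be calling get from T's Drop implementation, where the strong count has already reached zero).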
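For comparison, the standard library offers `Arc::new_cyclic`, which hands the constructor a Weak pointing at the allocation being built, so the same pattern can be written without a Mutex or a separate save step. This is a sketch of that alternative, not the mekeeper code; `shared_from_this` is a hypothetical method name borrowed from the C++ mixin, and it returns an Option rather than panicking.

```rust
use std::sync::{Arc, Weak};

pub struct Test {
    me: Weak<Test>,
    foo: i8,
}

impl Test {
    pub fn new() -> Arc<Self> {
        // The closure receives a Weak that will point at the finished Arc.
        Arc::new_cyclic(|me| Test {
            me: me.clone(),
            foo: 42,
        })
    }

    // Yields a fresh Arc to this very instance, or None if the last strong
    // reference is already gone (for example during Drop).
    pub fn shared_from_this(&self) -> Option<Arc<Self>> {
        self.me.upgrade()
    }
}

fn main() {
    let test = Test::new();
    let again = test.shared_from_this().expect("instance is alive");
    println!("same allocation: {}", Arc::ptr_eq(&test, &again));
    println!("foo: {}", again.foo);
}
```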
I'm coming from mostly OOP languages, so getting this concept to work in Rust kinda seems hard. I want to implement a basic counter that keeps count of how many "instances" I've made of that type, and keep them in a vector for later use.
I've tried many different things. The first was making a static vector variable, but that can't be done because statics with destructors are not allowed.
This was my first try:
struct Entity {
name: String,
}
struct EntityCounter {
count: i64,
}
impl Entity {
pub fn init() {
let counter = EntityCounter { count: 0 };
}
pub fn new(name: String) {
println!("Entity named {} was made.", name);
counter += 1; // counter variable unaccessable (is there a way to make it global to the struct (?..idek))
}
}
fn main() {
Entity::init();
Entity::new("Hello".to_string());
}
Second:
struct Entity {
name: String,
counter: i32,
}
impl Entity {
pub fn new(self) {
println!("Entity named {} was made.", self.name);
self.counter = self.counter + 1;
}
}
fn main() {
Entity::new(Entity { name: "Test".to_string() });
}
None of those work; I was just trying out some concepts to see how I might implement such a feature.
Your problems appear to be somewhat more fundamental than what you describe. You're kind of throwing code at the wall to see what sticks, and that's simply not going to get you anywhere. I'd recommend reading the Rust Book completely before continuing. If you don't understand something in it, ask about it. As it stands, you're demonstrating you don't understand variable scoping, return types, how instance construction works, how statics work, and how parameters are passed. That's a really shaky base to try and build any understanding on.
In this particular case, you're asking for something that's deliberately not straightforward. You say you want a counter and a vector of instances. The counter is simple enough, but a vector of instances? Rust doesn't allow easy sharing like other languages, so how you go about doing that depends heavily on what it is you're actually intending to use this for.
What follows is a very rough guess at something that's maybe vaguely similar to what you want.
/*!
Because we need the `lazy_static` crate, you need to add the following to your
`Cargo.toml` file:
```cargo
[dependencies]
lazy_static = "0.2.1"
```
*/
#[macro_use] extern crate lazy_static;
mod entity {
use std::sync::{Arc, Weak, Mutex};
use std::sync::atomic;
pub struct Entity {
pub name: String,
}
impl Entity {
pub fn new(name: String) -> Arc<Self> {
println!("Entity named {} was made.", name);
let ent = Arc::new(Entity {
name: name,
});
bump_counter();
remember_instance(ent.clone());
ent
}
}
/*
The counter is simple enough, though I'm not clear on *why* you even want
it in the first place. You don't appear to be using it for anything...
*/
static COUNTER: atomic::AtomicUsize = atomic::AtomicUsize::new(0);
fn bump_counter() {
// Add one using the most conservative ordering.
COUNTER.fetch_add(1, atomic::Ordering::SeqCst);
}
pub fn get_counter() -> usize {
COUNTER.load(atomic::Ordering::SeqCst)
}
/*
There are *multiple* ways of doing this part, and you simply haven't given
enough information on what it is you're trying to do. This is, at best,
a *very* rough guess.
`Mutex` lets us safely mutate the vector from any thread, and `Weak`
prevents `INSTANCES` from keeping every instance alive *forever*. I mean,
maybe you *want* that, but you didn't specify.
Note that I haven't written a "cleanup" function here to remove dead weak
references.
*/
lazy_static! {
static ref INSTANCES: Mutex<Vec<Weak<Entity>>> = Mutex::new(vec![]);
}
fn remember_instance(entity: Arc<Entity>) {
// Downgrade to a weak reference. Type constraint is just for clarity.
let entity: Weak<Entity> = Arc::downgrade(&entity);
INSTANCES
// Lock mutex
.lock().expect("INSTANCES mutex was poisoned")
// Push entity
.push(entity);
}
pub fn get_instances() -> Vec<Arc<Entity>> {
/*
This is about as inefficient as I could write this, but again, without
knowing your access patterns, I can't really do any better.
*/
INSTANCES
// Lock mutex
.lock().expect("INSTANCES mutex was poisoned")
// Get a borrowing iterator from the Vec.
.iter()
/*
Convert each `&Weak<Entity>` into a fresh `Arc<Entity>`. If we
couldn't (because the weak ref is dead), just drop that element.
*/
.filter_map(|weak_entity| weak_entity.upgrade())
// Collect into a new `Vec`.
.collect()
}
}
fn main() {
use entity::Entity;
let e0 = Entity::new("Entity 0".to_string());
println!("e0: {}", e0.name);
{
let e1 = Entity::new("Entity 1".to_string());
println!("e1: {}", e1.name);
/*
`e1` is dropped here, which should cause the underlying `Entity` to
stop existing, since there are no other "strong" references to it.
*/
}
let e2 = Entity::new("Entity 2".to_string());
println!("e2: {}", e2.name);
println!("Counter: {}", entity::get_counter());
println!("Instances:");
for ent in entity::get_instances() {
println!("- {}", ent.name);
}
}
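On a recent toolchain the same idea can be written with the standard library alone, since `Mutex::new` and `Vec::new` are usable in a static initializer (this assumes Rust 1.63 or newer). The following is a condensed sketch of the code above, not a replacement for it; `created_count` and `live_instances` are hypothetical names, and this version also prunes dead weak references, which the original leaves out.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex, Weak};

pub struct Entity {
    pub name: String,
}

// Counts every Entity ever created.
static COUNTER: AtomicUsize = AtomicUsize::new(0);
// Weak references, so the registry does not keep entities alive.
static INSTANCES: Mutex<Vec<Weak<Entity>>> = Mutex::new(Vec::new());

impl Entity {
    pub fn new(name: &str) -> Arc<Self> {
        let entity = Arc::new(Entity { name: name.to_string() });
        COUNTER.fetch_add(1, Ordering::SeqCst);
        INSTANCES.lock().unwrap().push(Arc::downgrade(&entity));
        entity
    }
}

pub fn created_count() -> usize {
    COUNTER.load(Ordering::SeqCst)
}

pub fn live_instances() -> Vec<Arc<Entity>> {
    let mut registry = INSTANCES.lock().unwrap();
    // Drop entries whose entity has already been destroyed.
    registry.retain(|weak| weak.strong_count() > 0);
    registry.iter().filter_map(Weak::upgrade).collect()
}

fn main() {
    let _e0 = Entity::new("Entity 0");
    {
        let _e1 = Entity::new("Entity 1");
        // `_e1` is dropped at the end of this scope.
    }
    let _e2 = Entity::new("Entity 2");

    println!("Counter: {}", created_count());
    for entity in live_instances() {
        println!("- {}", entity.name);
    }
}
```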
I am writing a wrapper/FFI for a C library that requires a global initialization call in the main thread as well as one for destruction.
Here is how I am currently handling it:
struct App;
impl App {
fn init() -> Self {
unsafe { ffi::InitializeMyCLib(); }
App
}
}
impl Drop for App {
fn drop(&mut self) {
unsafe { ffi::DestroyMyCLib(); }
}
}
which can be used like:
fn main() {
let _init_ = App::init();
// ...
}
This works fine, but it feels like a hack, tying these calls to the lifetime of an unnecessary struct. Having the destructor in a finally (Java) or at_exit (Ruby) block seems theoretically more appropriate.
Is there some more graceful way to do this in Rust?
EDIT
Would it be possible/safe to use this setup like so (using the lazy_static crate), instead of my second block above:
lazy_static! {
static ref APP: App = App::init();
}
Would this reference be guaranteed to be initialized before any other code and destroyed on exit? Is it bad practice to use lazy_static in a library?
This would also make it easier to facilitate access to the FFI through this one struct, since I wouldn't have to bother passing around the reference to the instantiated struct (called _init_ in my original example).
This would also make it safer in some ways, since I could make the App struct default constructor private.
I know of no way of enforcing that a method be called in the main thread beyond strongly-worded documentation. So, ignoring that requirement... :-)
Generally, I'd use std::sync::Once, which seems basically designed for this case:
A synchronization primitive which can be used to run a one-time global
initialization. Useful for one-time initialization for FFI or related
functionality. This type can only be constructed with the ONCE_INIT
value.
(On current Rust, ONCE_INIT is deprecated and a Once is constructed with Once::new().)
Note that there's no provision for any cleanup; many times you just have to leak whatever the library has done. Usually if a library has a dedicated cleanup path, it has also been structured to store all that initialized data in a type that is then passed into subsequent functions as some kind of context or environment. This would map nicely to Rust types.
Warning
Your current code is not as protective as you hope it is. Since your App is an empty struct, an end-user can construct it without calling your method:
let _init_ = App;
We will use a private zero-sized field to prevent this. See also What's the Rust idiom to define a field pointing to a C opaque pointer? for the proper way to construct opaque types for FFI.
Altogether, I'd use something like this:
use std::sync::Once;
mod ffi {
extern "C" {
pub fn InitializeMyCLib();
pub fn CoolMethod(arg: u8);
}
}
static C_LIB_INITIALIZED: Once = Once::new();
#[derive(Copy, Clone)]
struct TheLibrary(());
impl TheLibrary {
fn new() -> Self {
C_LIB_INITIALIZED.call_once(|| unsafe {
ffi::InitializeMyCLib();
});
TheLibrary(())
}
fn cool_method(&self, arg: u8) {
unsafe { ffi::CoolMethod(arg) }
}
}
fn main() {
let lib = TheLibrary::new();
lib.cool_method(42);
}
I did some digging around to see how other FFI libs handle this situation. Here is what I am currently using (similar to @Shepmaster's answer and based loosely on the initialization routine of curl-rust):
use std::sync::Once;

fn initialize() {
    static INIT: Once = Once::new();
    INIT.call_once(|| unsafe {
        ffi::InitializeMyCLib();
        // Register the C library's teardown to run at process exit.
        assert_eq!(libc::atexit(cleanup), 0);
    });

    extern "C" fn cleanup() {
        unsafe { ffi::DestroyMyCLib(); }
    }
}
I then call this function inside the public constructors for my public structs.
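To illustrate what calling it from every constructor buys, here is a small self-contained sketch. The C library is replaced by a made-up `fake_initialize_c_lib` function that merely counts its invocations, and `Handle` is a hypothetical public struct; the point is only that the initializer runs once no matter how many threads construct handles.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Once;
use std::thread;

// Counts how often the (fake) library initializer really runs.
static INIT_CALLS: AtomicUsize = AtomicUsize::new(0);

// Stand-in for ffi::InitializeMyCLib().
fn fake_initialize_c_lib() {
    INIT_CALLS.fetch_add(1, Ordering::SeqCst);
}

fn initialize() {
    static INIT: Once = Once::new();
    // Runs the initializer on the first call only; later calls return
    // immediately once it has completed.
    INIT.call_once(fake_initialize_c_lib);
}

// The private unit field keeps users from building a Handle without `new`.
pub struct Handle(());

impl Handle {
    pub fn new() -> Self {
        initialize();
        Handle(())
    }
}

fn main() {
    let threads: Vec<_> = (0..8)
        .map(|_| {
            thread::spawn(|| {
                let _handle = Handle::new();
            })
        })
        .collect();
    for t in threads {
        t.join().unwrap();
    }
    println!("initializer ran {} time(s)", INIT_CALLS.load(Ordering::SeqCst));
}
```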