Use precomputed big object in actix-web route handler - rust

Is there a way to make an actix-web route handler aware of a pre-computed heavy object, that is needed for the computation of result?
What I intend to do, in the end, is to avoid having to recompute my_big_heavy_object each time a request comes along, and instead compute it once and for all in main, there access it from the index method.
extern crate actix_web;
use actix_web::{server, App, HttpRequest};
fn index(_req: HttpRequest) -> HttpResponse {
// how can I access my_big_heavy_object from here?
let result = do_something_with(my_big_heavy_object);
HttpResponse::Ok()
.content_type("application/json")
.body(result)
}
fn main() {
let my_big_heavy_object = compute_big_heavy_object();
server::new(|| App::new().resource("/", |r| r.f(index)))
.bind("127.0.0.1:8088")
.unwrap()
.run();
}

First, create a struct which is the shared state for your application:
struct AppState {
my_big_heavy_object: HeavyObject,
}
It's better to create a context wrapper like this, rather than just using HeavyObject, so you can add other fields to it later if necessary.
A few objects in Actix will now need to interact with this, so you can make them aware of it by overriding the application state parameter in those types, for example HttpRequest<AppState>.
Your index handler can access the state through the HttpRequest's state property, and now looks like this:
fn index(req: HttpRequest<AppState>) -> HttpResponse {
let result = do_something_with(req.state.my_big_heavy_object);
HttpResponse::Ok()
.content_type("application/json")
.body(result)
}
When building the App, use the with_state constructor instead of new:
server::new(|| {
let app_state = AppState { my_big_heavy_object: compute_big_heavy_object() };
App::with_state(app_state).resource("/", |r| r.f(index))
})
.bind("127.0.0.1:8088")
.unwrap()
.run();
Note that the application state is assumed to be immutable. It sounds like you don't need any handlers to mutate it but if you did then you would have to use something like Cell or RefCell for interior mutability.

Related

Call a random function with variable arguments dynamically

I have a list of functions with variable arguments, and I want to randomly pick one of them, in runtime, and call it, on a loop. I'm looking to enhance the performance of my solution.
I have a function that calculates the arguments based on some randomness, and then (should) return a function pointer, which I could then call.
pub async fn choose_random_endpoint(
&self,
rng: ThreadRng,
endpoint_type: EndpointType,
) -> impl Future<Output = Result<std::string::String, MyError>> {
match endpoint_type {
EndpointType::Type1 => {
let endpoint_arguments = self.choose_endpoint_arguments(rng);
let endpoint = endpoint1(&self.arg1, &self.arg2, &endpoint_arguments.arg3);
endpoint
}
EndpointType::Type2 => {
let endpoint_arguments = self.choose_endpoint_arguments(rng);
let endpoint = endpoint2(
&self.arg1,
&self.arg2,
&endpoint_arguments.arg3,
rng.clone(),
);
endpoint
}
EndpointType::Type3 => {
let endpoint_arguments = self.choose_endpoint_arguments(rng);
let endpoint = endpoint3(
&self.arg1,
&self.arg2,
&endpoint_arguments.arg3,
rng.clone(),
);
endpoint
}
}
}
The error I obtain is
expected opaque type `impl Future<Output = Result<std::string::String, MyError>>` (opaque type at <src/calls/type1.rs:14:6>)
found opaque type `impl Future<Output = Result<std::string::String, MyError>>` (opaque type at <src/type2.rs:19:6>)
. The compiler advises me to await the endpoints, and this solves the issue, but is there a performance overhead to this?
Outer function:
Aassume there is a loop calling this function:
pub async fn make_call(arg1: &str, arg2: &str) -> Result<String> {
let mut rng = rand::thread_rng();
let random_endpoint_type = choose_random_endpoint_type(&mut rng);
let random_endpoint = choose_random_endpoint(&rng, random_endpoint_type);
// call the endpoint
Ok(response)
}
Now, I want to call make_call every X seconds, but I don't want my main thread to block during the endpoint calls, as those are expensive. I suppose the right way to approach this is spawning a new thread per X seconds of interval, that call make_call?
Also, performance-wise: having so many clones on the rng seems quite expensive. Is there a more performant way to do this?
The error you get is sort of unrelated to async. It's the same one you get when you try to return two different iterators from a function. Your function as written doesn't even need to be async. I'm going to remove async from it when it's not needed, but if you need async (like for implementing an async-trait) then you can add it back and it'll probably work the same.
I've reduced your code into a simpler example that has the same issue (playground):
async fn a() -> &'static str {
"a"
}
async fn b() -> &'static str {
"b"
}
fn a_or_b() -> impl Future<Output = &'static str> {
if rand::random() {
a()
} else {
b()
}
}
What you're trying to write
When you want to return a trait, but the specific type that implements that trait isn't known at compile time, you can return a trait object. Futures need to be Unpin to be awaited, so this uses a pinned box (playground).
fn a_or_b() -> Pin<Box<dyn Future<Output = &'static str>>> {
if rand::random() {
Box::pin(a())
} else {
Box::pin(b())
}
}
You may need the type to be something like Pin<Box<dyn Future<Output = &'static str> + Send + Sync + 'static>> depending on the context.
What you should write
I think the only reason you'd do the above is if you want to generate the future with some kind of async rng, then do something else, and then run the generated future after that. Otherwise there's no need to have nested futures; just await the inner futures when you call them (playground).
async fn a_or_b() -> &'static str {
if rand::random() {
a().await
} else {
b().await
}
}
This is conceptually equivalent to the Pin<Box> method, just without having to allocate a Box. Instead, you have an opaque type that implements Future itself.
Blocking
The blocking behavior of these is only slightly different. Pin<Box> will block on non-async things when you call it, while the async one will block on non-async things where you await it. This is probably mostly the random generation.
The blocking behavior of the endpoint is the same and depends on what happens inside there. It'll block or not block wherever you await either way.
If you want to have multiple make_call calls happening at the same time, you'll need to do that outside the function anyway. Using the tokio runtime, it would look something like this:
use tokio::task;
use futures::future::join_all;
let tasks: Vec<_> = (0..100).map(|_| task::spawn(make_call())).collect();
let results = join_all(tasks).await;
This also lets you do other stuff while the futures are running, in between collect(); and let results.
If something inside your function blocks, you'd want to spawn it with task::spawn_blocking (and then await that handle) so that the await call in make_call doesn't get blocked.
RNG
If your runtime is multithreaded, the ThreadRng will be an issue. You could create a type that implements Rng + Send with from_entropy, and pass that into your functions. Or you can call thread_rng or even just rand::random where you need it. This makes a new rng per thread, but will reuse them on later calls since it's a thread-local static. On the other hand, if you don't need as much randomness, you can go with a Rng + Send type from the beginning.
If your runtime isn't multithreaded, you should be able to pass &mut ThreadRng all the way through, assuming the borrow checker is smart enough. You won't be able to pass it into an async function and then spawn it, though, so you'd have to create a new one inside that function.

How can I store variables in traits so that it can be used in other methods on trait?

I have several objects in my code that have common functionality. These objects act essentially like services where they have a life-time (controlled by start/stop) and perform work until their life-time ends. I am in the process of trying to refactor my code to reduce duplication but getting stuck.
At a high level, my objects all have the code below implemented in them.
impl SomeObject {
fn start(&self) {
// starts a new thread.
// stores the 'JoinHandle' so thread can be joined later.
}
fn do_work(&self) {
// perform work in the context of the new thread.
}
fn stop(&self) {
// interrupts the work that this object is doing.
// stops the thread.
}
}
Essentially, these objects act like "services" so in order to refactor, my first thought was that I should create a trait called "service" as shown below.
trait Service {
fn start(&self) {}
fn do_work(&self);
fn stop(&self) {}
}
Then, I can just update my objects to each implement the "Service" trait. The issue that I am having though, is that since traits are not allowed to have fields/properties, I am not sure how I can go about saving the 'JoinHandle' in the trait so that I can use it in the other methods.
Is there an idiomatic way to handle this problem in Rust?
tldr; how can I save variables in a trait so that they can be re-used in different trait methods?
Edit:
Here is the solution I settled on. Any feedback is appreciated.
extern crate log;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::JoinHandle;
pub struct Context {
name: String,
thread_join_handle: JoinHandle<()>,
is_thread_running: Arc<AtomicBool>,
}
pub trait Service {
fn start(&self, name: String) -> Context {
log::trace!("starting service, name=[{}]", name);
let is_thread_running = Arc::new(AtomicBool::new(true));
let cloned_is_thread_running = is_thread_running.clone();
let thread_join_handle = std::thread::spawn(move || loop {
while cloned_is_thread_running.load(Ordering::SeqCst) {
Self::do_work();
}
});
log::trace!("started service, name=[{}]", name);
return Context {
name: name,
thread_join_handle: thread_join_handle,
is_thread_running: is_thread_running,
};
}
fn stop(context: Context) {
log::trace!("stopping service, name=[{}]", context.name);
context.is_thread_running.store(false, Ordering::SeqCst);
context
.thread_join_handle
.join()
.expect("joining service thread");
log::trace!("stopped service, name=[{}]", context.name);
}
fn do_work();
}
I think you just need to save your JoinHandle in the struct's state itself, then the other methods can access it as well because they all get all the struct's data passed to them already.
struct SomeObject {
join_handle: JoinHandle;
}
impl Service for SomeObject {
fn start(&mut self) {
// starts a new thread.
// stores the 'JoinHandle' so thread can be joined later.
self.join_handle = what_ever_you_wanted_to_set //Store it here like this
}
fn do_work(&self) {
// perform work in the context of the new thread.
}
fn stop(&self) {
// interrupts the work that this object is doing.
// stops the thread.
}
}
Hopefully that works for you.
Depending on how you're using the trait, you could restructure your code and make it so start returns a JoinHandle and the other functions take the join handle as input
trait Service {
fn start(&self) -> JoinHandle;
fn do_work(&self, handle: &mut JoinHandle);
fn stop(&self, handle: JoinHandle);
}
(maybe with different function arguments depending on what you need). this way you could probably cut down on duplicate code by putting all the code that handles the handles (haha) outside of the structs themselves and makes it more generic. If you want the structs the use the JoinHandle outside of this trait, I'd say it's best to just do what Equinox suggested and just make it a field.

Run async function in run_interval and return result

I need to run an async function in actix::prelude::AsyncContext::run_interval, but I need to also pass in a struct member and return the result (not the future). This is a somewhat more complex version of this question here. As can be seen in the commented section below, I have tried a few approaches but all of them fail for one reason or another.
I have looked at a few related resources, including the AsyncContext trait and these StackOverflow questions: 3, 4.
Here is my example code (actix crate is required in Cargo.toml):
use std::time::Duration;
use actix::{Actor, Arbiter, AsyncContext, Context, System};
struct MyActor {
id: i32
}
impl MyActor {
fn new(id: i32) -> Self {
Self {
id: id,
}
}
fn heartbeat(&self, ctx: &mut <Self as Actor>::Context) {
ctx.run_interval(Duration::from_secs(1), |act, ctx| {
//lifetime issue
//let res = 0;
//Arbiter::spawn(async {
// res = two(act.id).await;
//});
//future must return `()`
//let res = Arbiter::spawn(two(act.id));
//async closures unstable
//let res = Arbiter::current().exec(async || {
// two(act.id).await
//});
});
}
}
impl Actor for MyActor {
type Context = Context<Self>;
fn started(&mut self, ctx: &mut Self::Context) {
self.heartbeat(ctx);
}
}
// assume functions `one` and `two` live in another module
async fn one(id: i32) -> i32 {
// assume something is done with id here
let x = id;
1
}
async fn two(id: i32) -> i32 {
let x = id;
// assume this may call other async functions
one(x).await;
2
}
fn main() {
let mut system = System::new("test");
system.block_on(async { MyActor::new(10).start() });
system.run();
}
Rust version:
$ rustc --version
rustc 1.50.0 (cb75ad5db 2021-02-10)
Using Arbiter::spawn would work, but the issue is with the data being accessed from inside the async block that's passed to Arbiter::spawn. Since you're accessing act from inside the async block, that reference will have to live longer than the closure that calls Arbiter::spawn. In fact, in will have to have a lifetime of 'static since the future produced by the async block could potentially live until the end of the program.
One way to get around this in this specific case, given that you need an i32 inside the async block, and an i32 is a Copy type, is to move it:
ctx.run_interval(Duration::from_secs(1), |act, ctx| {
let id = act.id;
Arbiter::spawn(async move {
two(id).await;
});
});
Since we're using async move, the id variable will be moved into the future, and will thus be available whenever the future is run. By assigning it to id first, we are actually copying the data, and it's the copy (id) that will be moved.
But this might not be what you want, if you're trying to get a more general solution where you can access the object inside the async function. In that case, it gets a bit tricker, and you might want to consider not using async functions if possible. If you must, it might be possible to have a separate object with the data you need which is surrounded by std::rc::Rc, which can then be moved into the async block without duplicating the underlying data.

Why does my shared actix-web state sometimes reset back to the original value?

I'm trying to implement shared state in the Actix-Web framework with Arc and Mutex. The following code compiles, but when I run it, the counter sometimes goes all the way back to 0. How do I prevent that from happening?
use actix_web::{web, App, HttpServer};
use std::sync::{Arc, Mutex};
// This struct represents state
struct AppState {
app_name: String,
counter: Arc<Mutex<i64>>,
}
fn index(data: web::Data<AppState>) -> String {
let mut counter = data.counter.lock().unwrap();
*counter += 1;
format!("{}", counter)
}
pub fn main() {
HttpServer::new(|| {
App::new()
.hostname("hello world")
.register_data(web::Data::new(AppState {
app_name: String::from("Actix-web"),
counter: Arc::new(Mutex::new(0)),
}))
.route("/", web::get().to(index))
})
.bind("127.0.0.1:8088")
.unwrap()
.run()
.unwrap();
}
HttpServer::new takes a closure which is called for each thread that runs the server. This means that multiple instances of AppState is created, one for each thread. Depending on which thread answers the HTTP request, you will get different instances of data and therefor different counter values.
To prevent this from happening, create web::Data<AppState> outside the closure and use a cloned reference inside the HttpServer::new closure.

What's the easiest way to get the HTML output of an actix-web endpoint handler to be rendered properly?

I've defined an endpoint with actix-web like so:
#[derive(Deserialize)]
struct RenderInfo {
filename: String,
}
fn render(info: actix_web::Path<RenderInfo>) -> Result<String> {
// ...
}
App::new()
.middleware(middleware::Logger::Default())
.resource("/{filename}", |r| r.get().with(render))
The problem I've run into is that the raw HTML gets displayed in the browser rather than being rendered. I assume the content-type is not being set properly.
Most of the actix-web examples I saw used impl Responder for the return type, but I wasn't able to figure out how to fix the type inference issues that created. The reason seems to have something to do with file operations returning a standard failure::Error-based type. It looks like actix_web requires implementation of a special WebError to block unintended propagation of errors. For this particular instance, I don't really care, because it's more of an internal tool.
From the actix-web examples, use HttpResponse:
fn welcome(req: &HttpRequest) -> Result<HttpResponse> {
println!("{:?}", req);
// session
let mut counter = 1;
if let Some(count) = req.session().get::<i32>("counter")? {
println!("SESSION value: {}", count);
counter = count + 1;
}
// set counter to session
req.session().set("counter", counter)?;
// response
Ok(HttpResponse::build(StatusCode::OK)
.content_type("text/html; charset=utf-8")
.body(include_str!("../static/welcome.html")))
}

Resources