deno_runtime running multiple invokes on single worker concurrently - rust

I'm trying to run multiple invocation of the same script on a single deno MainWorker concurrently, and waiting for their
results (since the scripts can be async). Conceptually, I want something like the loop in run_worker below.
type Tx = Sender<(String, Sender<String>)>;
type Rx = Receiver<(String, Sender<String>)>;
struct Runner {
worker: MainWorker,
futures: FuturesUnordered<Pin<Box<dyn Future<Output=(String, Result<Global<Value>, Error>)>>>>,
response_futures: FuturesUnordered<Pin<Box<dyn Future<Output=(String, Result<(), SendError<String>>)>>>>,
result_senders: HashMap<String, Sender<String>>,
}
impl Runner {
fn new() ...
async fn run_worker(&mut self, rx: &mut Rx, main_module: ModuleSpecifier, user_module: ModuleSpecifier) {
self.worker.execute_main_module(&main_module).await.unwrap();
self.worker.preload_side_module(&user_module).await.unwrap();
loop {
tokio::select! {
msg = rx.recv() => {
if let Some((id, sender)) = msg {
let global = self.worker.js_runtime.execute_script("test", "mod.entry()").unwrap();
self.result_senders.insert(id, sender);
self.futures.push(Box::pin(async {
let resolved = self.worker.js_runtime.resolve_value(global).await;
return (id, resolved);
}));
}
},
script_result = self.futures.next() => {
if let Some((id, out)) = script_result {
self.response_futures.push(Box::pin(async {
let value = deserialize_value(out.unwrap(), &mut self.worker);
let res = self.result_senders.remove(&id).unwrap().send(value).await;
return (id.clone(), res);
}));
}
},
// also handle response_futures here
else => break,
}
}
}
}
The worker can't be borrowed as mutable multiple times, so this won't work. So the worker has to be a RefCell, and
I've created a BorrowingFuture:
struct BorrowingFuture {
worker: RefCell<MainWorker>,
global: Global<Value>,
id: String
}
And its poll implementation:
fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
match Pin::new(&mut Box::pin(self.worker.borrow_mut().js_runtime.resolve_value(self.global.clone()))).poll(cx) {
Poll::Ready(result) => Poll::Ready((self.id.clone(), result)),
Poll::Pending => {
cx.waker().clone().wake_by_ref();
Poll::Pending
}
}
}
So the above
self.futures.push(Box::pin(async {
let resolved = self.worker.js_runtime.resolve_value(global).await;
return (id, resolved);
}));
would become
self.futures.push(Box::pin(BorrowingFuture{worker: self.worker, global: global.clone(), id: id.clone()}));
and this would have to be done for the response_futures above as well.
But I see a few issues with this.
Creating a new future on every poll and then polling that seems wrong, but it does work.
It probably has a performance impact because new objects are created constantly.
The same issue would happen for the response futures, which would call send on each poll, which seems completely wrong.
The waker.wake_by_ref is called on every poll, because there is no way to know when a script result
will resolve. This results in the future being polled thousands (and more) times per second (always creating a new object),
which could be the same as checking it in a loop, I guess.
Note My current setup doesn't use select!, but an enum as Output from multiple Future implementations, pushed into
a single FuturesUnordered, and then matched to handle the correct type (script, send, receive). I used select here because
it's far less verbose, and gets the point across.
Is there a way to do this better/more efficiently? Or is it just not the way MainWorker was meant to be used?
main for completeness:
#[tokio::main]
async fn main() {
let main_module = deno_runtime::deno_core::resolve_url(MAIN_MODULE_SPECIFIER).unwrap();
let user_module = deno_runtime::deno_core::resolve_url(USER_MODULE_SPECIFIER).unwrap();
let (tx, mut rx) = channel(1);
let (result_tx, mut result_rx) = channel(1);
let handle = thread::spawn(move || {
let runtime = tokio::runtime::Builder::new_multi_thread().enable_all().build().unwrap();
let mut runner = Runner::new();
runtime.block_on(runner.run_worker(&mut rx, main_module, user_module));
});
tx.send(("test input".to_string(), result_tx)).await.unwrap();
let result = result_rx.recv().await.unwrap();
println!("result from worker {}", result);
handle.join().unwrap();
}

Related

How can I run asynchronous tasks on a single thread in order?

I am working on a program using rust-tokio for asynchronous execution. The main function periodically calls a function to append to a CSV file to log operation over time.
I would like to make the CSV creation function asynchronous and run it as a separate task so I can continue the main function if CSV creation is taking some time (like waiting for another application like Excel to release it).
Is there an elegant way to accomplish this?
LocalSet almost seems like it would do the job, but the tasks need to execute in order so the CSV is chronological. To me, the documentation doesn't seem to guarantee this.
Here's some pseudo code to illustrate the idea. Essentially, I'm thinking a queue of tasks that need to be completed.
#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
let local = task::LocalSetOrdered::new(); //This is a fictitious struct
let mut data: usize = 10; //For simplicity, just store a single number
loop {
// Some operations here
data = data + 1;
let data_clone = data.clone();
//Add a new task to complete after all prior tasks
local.push(async move {
match append_to_csv(data_clone).await {
Ok(_) => Ok(()),
Err(_) => Err(()),
}
});
sleep(Duration::from_secs(60)).await;
}
Ok(())
}
async fn append_to_csv(data_in: usize) -> Result<(), Box<dyn Error>> {
loop {
let file = match OpenOptions::new().write(true).append(true).open(filename) {
Ok(f) => f,
Err(_) => {
//Error opening the file, try again
sleep(Duration::from_secs(60)).await;
continue;
}
};
let wtr = csv::Writer::from_writer(file);
let date_time = Utc::now();
wtr.write_field(format!("{}", date_time.format("%Y/%m/%d %H:%M")))?;
wtr.write_field(format!("{}", data_in))?;
wtr.write_record(None::<&[u8]>)?; //Finish the line
}
}
You could use a worker task to write to the csv file, and a channel to pass data to be written
use tokio::sync::mpsc::{channel, Receiver};
#[derive(Debug)]
pub struct CsvData(i32, &'static str);
async fn append_to_csv(mut rx: Receiver<CsvData>) {
let mut wtr = csv::Writer::from_writer(std::io::stdout());
while let Some(data) = rx.recv().await {
wtr.write_record([&data.0.to_string(), data.1]).unwrap();
wtr.flush().unwrap();
}
}
#[tokio::main]
async fn main() {
let (tx, rx) = channel(10);
tokio::spawn(async {
append_to_csv(rx).await;
});
for i in 0.. {
tx.send(CsvData(i, "Hello world")).await.unwrap();
}
}
The channel sender can be cloned if you need to write data sourced from multiple tasks.

Receiver on tokio's mpsc channel only receives messages when buffer is full

I've spent a few hours trying to figure this out and I'm pretty done. I found the question with a similar name, but that looks like something was blocking synchronously which was messing with tokio. That very well may be the issue here, but I have absolutely no idea what is causing it.
Here is a heavily stripped down version of my project which hopefully gets the issue across.
use std::io;
use futures_util::{
SinkExt,
stream::{SplitSink, SplitStream},
StreamExt,
};
use tokio::{
net::TcpStream,
sync::mpsc::{channel, Receiver, Sender},
};
use tokio_tungstenite::{
connect_async,
MaybeTlsStream,
tungstenite::Message,
WebSocketStream,
};
#[tokio::main]
async fn main() {
connect_to_server("wss://a_valid_domain.com".to_string()).await;
}
async fn read_line() -> String {
loop {
let mut str = String::new();
io::stdin().read_line(&mut str).unwrap();
str = str.trim().to_string();
if !str.is_empty() {
return str;
}
}
}
async fn connect_to_server(url: String) {
let (ws_stream, _) = connect_async(url).await.unwrap();
let (write, read) = ws_stream.split();
let (tx, rx) = channel::<ChannelMessage>(100);
tokio::spawn(channel_thread(write, rx));
tokio::spawn(handle_std_input(tx.clone()));
read_messages(read, tx).await;
}
#[derive(Debug)]
enum ChannelMessage {
Text(String),
Close,
}
// PROBLEMATIC FUNCTION
async fn channel_thread(
mut write: SplitSink<WebSocketStream<MaybeTlsStream<TcpStream>>, Message>,
mut rx: Receiver<ChannelMessage>,
) {
while let Some(msg) = rx.recv().await {
println!("{:?}", msg); // This only fires when buffer is full
match msg {
ChannelMessage::Text(text) => write.send(Message::Text(text)).await.unwrap(),
ChannelMessage::Close => {
write.close().await.unwrap();
rx.close();
return;
}
}
}
}
async fn read_messages(
mut read: SplitStream<WebSocketStream<MaybeTlsStream<TcpStream>>>,
tx: Sender<ChannelMessage>,
) {
while let Some(msg) = read.next().await {
let msg = match msg {
Ok(m) => m,
Err(_) => continue
};
match msg {
Message::Text(m) => println!("{}", m),
Message::Close(_) => break,
_ => {}
}
}
if !tx.is_closed() {
let _ = tx.send(ChannelMessage::Close).await;
}
}
async fn handle_std_input(tx: Sender<ChannelMessage>) {
loop {
let str = read_line().await;
if tx.is_closed() {
break;
}
tx.send(ChannelMessage::Text(str)).await.unwrap();
}
}
As you can see, what I'm trying to do is:
Connect to a websocket
Print outgoing messages from the websocket
Forward any input from stdin to the websocket
Also a custom heartbeat solution which was trimmed out
The issue lies in the channel_thread() function. I move the websocket writer into this function as well as the channel receiver. The issue is, it only loops over the sent objects when the buffer is full.
I've spent a lot of time trying to solve this, any help is greatly appreciated.
Here, you make a blocking synchronous call in an async context:
async fn read_line() -> String {
loop {
let mut str = String::new();
io::stdin().read_line(&mut str).unwrap();
// ^^^^^^^^^^^^^^^^^^^
// This is sync+blocking
str = str.trim().to_string();
if !str.is_empty() {
return str;
}
}
}
You never ever make blocking synchronous calls in an async context, because that prevents the entire thread from running other async tasks. Your channel receiver task is likely also assigned to this thread, so it's having to wait until all the blocking calls are done and whatever invokes this function yields back to the async runtime.
Tokio has its own async version of stdin, which you should use instead.

Tokio channel sends, but doesn't receive

TL;DR I'm trying to have a background thread that's ID'd that is controlled via that ID and web calls, and the background threads doesn't seem to be getting the message via all the types of channels I've tried.
I've tried both the std channels as well as tokio's, and of those I've tried all but the watcher type from tokio. All have the same result which probably means that I've messed something up somewhere without realizing it, but I can't find the issue:
use std::collections::{
hash_map::Entry::{Occupied, Vacant},
HashMap,
};
use std::sync::Arc;
use tokio::sync::mpsc::{self, UnboundedSender};
use tokio::sync::RwLock;
use tokio::task::JoinHandle;
use uuid::Uuid;
use warp::{http, Filter};
#[derive(Default)]
pub struct Switcher {
pub handle: Option<JoinHandle<bool>>,
pub pipeline_end_tx: Option<UnboundedSender<String>>,
}
impl Switcher {
pub fn set_sender(&mut self, tx: UnboundedSender<String>) {
self.pipeline_end_tx = Some(tx);
}
pub fn set_handle(&mut self, handle: JoinHandle<bool>) {
self.handle = Some(handle);
}
}
const ADDR: [u8; 4] = [0, 0, 0, 0];
const PORT: u16 = 3000;
type RunningPipelines = Arc<RwLock<HashMap<String, Arc<RwLock<Switcher>>>>>;
#[tokio::main]
async fn main() {
let running_pipelines = Arc::new(RwLock::new(HashMap::<String, Arc<RwLock<Switcher>>>::new()));
let session_create = warp::post()
.and(with_pipelines(running_pipelines.clone()))
.and(warp::path("session"))
.then(|pipelines: RunningPipelines| async move {
println!("session requested OK!");
let id = Uuid::new_v4();
let mut switcher = Switcher::default();
let (tx, mut rx) = mpsc::unbounded_channel::<String>();
switcher.set_sender(tx);
let t = tokio::spawn(async move {
println!("Background going...");
//This would be something processing in the background until it received the end signal
match rx.recv().await {
Some(v) => {
println!(
"Got end message:{} YESSSSSS#!##!!!!!!!!!!!!!!!!1111eleven",
v
);
}
None => println!("Error receiving end signal:"),
}
println!("ABORTING HANDLE");
true
});
let ret = HashMap::from([("session_id", id.to_string())]);
switcher.set_handle(t);
{
pipelines
.write()
.await
.insert(id.to_string(), Arc::new(RwLock::new(switcher)));
}
Ok(warp::reply::json(&ret))
});
let session_end = warp::delete()
.and(with_pipelines(running_pipelines.clone()))
.and(warp::path("session"))
.and(warp::query::<HashMap<String, String>>())
.then(
|pipelines: RunningPipelines, p: HashMap<String, String>| async move {
println!("session end requested OK!: {:?}", p);
match p.get("session_id") {
None => Ok(warp::reply::with_status(
"Please specify session to end",
http::StatusCode::BAD_REQUEST,
)),
Some(id) => {
let mut pipe = pipelines.write().await;
match pipe.entry(String::from(id)) {
Occupied(handle) => {
println!("occupied");
let (k, v) = handle.remove_entry();
drop(pipe);
println!("removed from hashmap, key:{}", k);
let s = v.write().await;
if let Some(h) = &s.handle {
if let Some(tx) = &s.pipeline_end_tx {
match tx.send("goodbye".to_string()) {
Ok(res) => {
println!(
"sent end message|{:?}| to fpipeline: {}",
res, id
);
//Added this to try to get it to at least Error on the other side
drop(tx);
},
Err(err) => println!(
"ERROR sending end message to pipeline({}):{}",
id, err
),
};
} else {
println!("no sender channel found for pipeline: {}", id);
};
h.abort();
} else {
println!(
"no luck finding the value in handle in the switcher: {}",
id
);
};
}
Vacant(_) => {
println!("no luck finding the handle in the pipelines: {}", id)
}
};
Ok(warp::reply::with_status("done", http::StatusCode::OK))
}
}
},
);
let routes = session_create
.or(session_end)
.recover(handle_rejection)
.with(warp::cors().allow_any_origin());
println!("starting server...");
warp::serve(routes).run((ADDR, PORT)).await;
}
async fn handle_rejection(
err: warp::Rejection,
) -> Result<impl warp::Reply, std::convert::Infallible> {
Ok(warp::reply::json(&format!("{:?}", err)))
}
fn with_pipelines(
pipelines: RunningPipelines,
) -> impl Filter<Extract = (RunningPipelines,), Error = std::convert::Infallible> + Clone {
warp::any().map(move || pipelines.clone())
}
depends:
[dependencies]
warp = "0.3"
tokio = { version = "1", features = ["full"] }
uuid = { version = "0.8.2", features = ["serde", "v4"] }
Results when I boot up, send a "create" request, and then an "end" request with the received ID:
starting server...
session requested OK!
Background going...
session end requested OK!: {"session_id": "6b984a45-38d8-41dc-bf95-422f75c5a429"}
occupied
removed from hashmap, key:6b984a45-38d8-41dc-bf95-422f75c5a429
sent end message|()| to fpipeline: 6b984a45-38d8-41dc-bf95-422f75c5a429
You'll notice that the background thread starts (and doesn't end) when the "create" request is made, but when the "end" request is made, while everything appears to complete successfully from the request(web) side, the background thread doesn't ever receive the message. As I've said I've tried all different channel types and moved things around to get it into this configuration... i.e. flattened and thread safetied as much as I could or at least could think of. I'm greener than I would like in rust, so any help would be VERY appreciated!
I think that the issue here is that you are sending the message and then immediately aborting the background task:
tx.send("goodbye".to_string());
//...
h.abort();
And the background task does not have time to process the message, as the abort is of higher priority.
What you need is to join the task, not to abort it.
Curiously, tokio tasks handles do not have a join() method, instead you wait for the handle itself. But for that you need to own the handle, so first you have to extract the handle from the Switcher:
let mut s = v.write().await;
//steal the task handle
if let Some(h) = s.handle.take() {
//...
tx.send("goodbye".to_string());
//...
//join the task
h.await.unwrap();
}
Note that joining a task may fail, in case the task is aborted or panicked. I'm just panicking in the code above, but you may want to do something different.
Or... you could not to wait for the task. In tokio if you drop a task handle, it will be detached. Then, it will finish when it finishes.

Waiting for a list of futures in Rust

I am attempting to make a future that continuously finds new work to do and then maintains a set of futures for those work items. I would like to make sure neither my main future that finds work to be blocked for long periods of time and to have my work being done concurrently.
Here is a rough overview of what I am trying to do. Specifically isDone does not exist and also from what I can understand from the docs isn't necessarily a valid way to use futures in Rust. What is the idomatic way of doing this kind of thing?
use std::collections::HashMap;
use tokio::runtime::Runtime;
async fn find_work() -> HashMap<i64, String> {
// Go read from the DB or something...
let mut work = HashMap::new();
work.insert(1, "test".to_string());
work.insert(2, "test".to_string());
return work;
}
async fn do_work(id: i64, value: String) -> () {
// Result<(), Error> {
println!("{}: {}", id, value);
}
async fn async_main() -> () {
let mut pending_work = HashMap::new();
loop {
for (id, value) in find_work().await {
if !pending_work.contains_key(&id) {
let fut = do_work(id, value);
pending_work.insert(id, fut);
}
}
pending_work.retain(|id, fut| {
if isDone(fut) {
// do something with the result
false
} else {
true
}
});
}
}
fn main() {
let runtime = Runtime::new().unwrap();
let exec = runtime.executor();
exec.spawn(async_main());
runtime.shutdown_on_idle();
}

Why does this Delay future inside poll() not work in my custom Stream type?

I want to print "Hello" once a second.
Quoting the doc:
Futures use a poll based model. The consumer of a future repeatedly calls the poll function. The future then attempts to complete. If the future is able to complete, it returns Async::Ready(value). If the future is unable to complete due to being blocked on an internal resource (such as a TCP socket), it returns Async::NotReady.
My poll function returns NotReady if Delays return is NotReady, but nothing is printed to stdout.
use futures::{Async, Future, Stream}; // 0.1.25
use std::time::{Duration, Instant};
use tokio::timer::Delay; // 0.1.15
struct SomeStream;
impl Stream for SomeStream {
type Item = String;
type Error = ();
fn poll(&mut self) -> Result<Async<Option<Self::Item>>, Self::Error> {
let when = Instant::now() + Duration::from_millis(1000);
let mut task = Delay::new(when).map_err(|e| eprintln!("{:?}", e));
match task.poll() {
Ok(Async::Ready(value)) => {}
Ok(Async::NotReady) => return Ok(Async::NotReady),
Err(err) => return Err(()),
}
Ok(Async::Ready(Some("Hello".to_string())))
}
}
fn main() {
let s = SomeStream;
let future = s
.for_each(|item| {
println!("{:?}", item);
Ok(())
})
.map_err(|e| {});
tokio::run(future);
}
The main issue here is that state management is missing. You are creating a new Delay future every time the stream is polled, rather than holding on to it until it's resolved.
This would lead to never seeing any items coming out of the stream, since these futures are only being polled once, likely yielding NotReady each time.
You need to keep track of the delay future in your type SomeStream. In this case, one can use an option, so as to also identify whether we need to create a new delay.
#[derive(Debug, Default)]
struct SomeStream {
delay: Option<Delay>,
}
The subsequent code for SomeStream::poll, with better error handling and more idiomatic constructs, would become something like this:
impl Stream for SomeStream {
type Item = String;
type Error = Box<dyn std::error::Error + Send + Sync>; // generic error
fn poll(&mut self) -> Result<Async<Option<Self::Item>>, Self::Error> {
let delay = self.delay.get_or_insert_with(|| {
let when = Instant::now() + Duration::from_millis(1000);
Delay::new(when)
});
match delay.poll() {
Ok(Async::Ready(value)) => {
self.delay = None;
Ok(Async::Ready(Some("Hello".to_string())))
},
Ok(Async::NotReady) => Ok(Async::NotReady),
Err(err) => Err(err.into()),
}
}
}
Or, even better, using the try_ready! macro, which makes the return of errors and NotReady signals with less boilerplate.
fn poll(&mut self) -> Result<Async<Option<Self::Item>>, Self::Error> {
let delay = self.delay.get_or_insert_with(|| {
let when = Instant::now() + Duration::from_millis(1000);
Delay::new(when)
});
try_ready!(delay.poll());
// tick!
self.delay = None;
Ok(Async::Ready(Some("Hello".to_string())))
}
(Playground)

Resources