Why does my program get stuck at epoll_wait? [closed] - multithreading

Closed. This question was caused by a typo or a problem that can no longer be reproduced, and it is not currently accepting answers. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 5 months ago.
I have some structs that are used to deserialize requests from GCP alerting. The top-level structs implement FromRequest, and the nested structs all implement serde's Deserialize and Serialize traits.
Here are the structs (I've omitted the ones that serde fills in, for brevity):
#[derive(Debug, Clone, PartialEq)]
struct GcpAlert {
pub headers: GcpHeaders,
pub body: GcpBody,
}
#[async_trait]
impl FromRequest<Body> for GcpAlert {
type Rejection = StatusCode;
async fn from_request(req: &mut RequestParts<Body>) -> Result<Self, Self::Rejection> {
let body = GcpBody::from_request(req).await?;
let headers = GcpHeaders::from_request(req).await?;
Ok(Self { headers, body })
}
}
#[derive(Debug, Clone)]
struct GcpHeaders {
pub host: TypedHeader<Host>,
pub content_length: TypedHeader<ContentLength>,
pub content_type: TypedHeader<ContentType>,
pub user_agent: TypedHeader<UserAgent>,
}
#[async_trait]
impl FromRequest<Body> for GcpHeaders {
type Rejection = StatusCode;
async fn from_request(req: &mut RequestParts<Body>) -> Result<Self, Self::Rejection> {
let bad_req = StatusCode::BAD_REQUEST;
let host: TypedHeader<Host> =
TypedHeader::from_request(req).await.map_err(|_| bad_req)?;
let content_length: TypedHeader<ContentLength> =
TypedHeader::from_request(req).await.map_err(|_| bad_req)?;
let content_type: TypedHeader<ContentType> =
TypedHeader::from_request(req).await.map_err(|_| bad_req)?;
let user_agent: TypedHeader<UserAgent> =
TypedHeader::from_request(req).await.map_err(|_| bad_req)?;
Ok(Self {
host,
content_length,
content_type,
user_agent,
})
}
}
#[derive(Debug, Clone, Deserialize, Serialize, PartialEq)]
struct GcpBody {
pub incident: GcpIncident,
pub version: Box<str>,
}
#[async_trait]
impl FromRequest<Body> for GcpBody {
type Rejection = StatusCode;
async fn from_request(req: &mut RequestParts<Body>) -> Result<Self, Self::Rejection> {
let serv_err = StatusCode::INTERNAL_SERVER_ERROR;
let bad_req = StatusCode::BAD_REQUEST;
let body = req.body_mut().as_mut().ok_or(serv_err)?;
let buffer = body::to_bytes(body).await.map_err(|_| serv_err)?;
Ok(serde_json::from_slice(&buffer).map_err(|_| bad_req)?)
}
}
To test this, I have created a test that compares a manually instantiated GcpAlert struct and one created via axum. Note that I've omitted details about the manually created struct and request, as I'm fairly certain they are unrelated.
#[tokio::test]
async fn test_request_deserialization() {
async fn handle_alert(alert: GcpAlert) {
let expected = GcpAlert {
headers: GcpHeaders { /* headers */ },
body: GcpBody { /* body */ }
};
assert_eq!(alert, expected);
}
let app = Router::new().route("/", post(handle_alert));
// TestClient is similar to this: https://github.com/tokio-rs/axum/blob/main/axum/src/test_helpers/test_client.rs
let client = TestClient::new(app);
client.post("/")
.header("host", "<host>")
.header("content-length", 1024)
.header("content-type", ContentType::json().to_string())
.header("user-agent", "<user-agent>")
.header("accept-enconding", "gzip, deflate, br")
.body(/* body */)
.send().await;
}
My issue is that the program freezes in the following line of GcpBody's FromRequest impl.
let buffer = body::to_bytes(body).await.map_err(|_| serv_err)?;
I've tried to debug the issue a little, but I'm not really familiar with assembly/LLVM/etc.
It looks like two threads are active here. I can artificially increase the number of threads by using the multi-threading attribute on the test, but that doesn't change the end result; the call stack just gets bigger.
Thread1 callstack:
syscall (#syscall:12)
std::sys::unix::futex::futex_wait (#std::sys::unix::futex::futex_wait:64)
std::sys_common::thread_parker::futex::Parker::park_timeout (#std::thread::park_timeout:25)
std::thread::park_timeout (#std::thread::park_timeout:18)
std::sync::mpsc::blocking::WaitToken::wait_max_until (#std::sync::mpsc::blocking::WaitToken::wait_max_until:18)
std::sync::mpsc::shared::Packet<T>::recv (#std::sync::mpsc::shared::Packet<T>::recv:94)
std::sync::mpsc::Receiver<T>::recv_deadline (#test::run_tests:1771)
std::sync::mpsc::Receiver<T>::recv_timeout (#test::run_tests:1696)
test::run_tests (#test::run_tests:1524)
test::console::run_tests_console (#test::console::run_tests_console:290)
test::test_main (#test::test_main:102)
test::test_main_static (#test::test_main_static:34)
gcp_teams_alerts::main (/home/ak_lo/ドキュメント/Rust/gcp-teams-alerts/src/main.rs:1)
core::ops::function::FnOnce::call_once (#core::ops::function::FnOnce::call_once:6)
std::sys_common::backtrace::__rust_begin_short_backtrace (#std::sys_common::backtrace::__rust_begin_short_backtrace:6)
std::rt::lang_start::{{closure}} (#std::rt::lang_start::{{closure}}:7)
core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once (#std::rt::lang_start_internal:242)
std::panicking::try::do_call (#std::rt::lang_start_internal:241)
std::panicking::try (#std::rt::lang_start_internal:241)
std::panic::catch_unwind (#std::rt::lang_start_internal:241)
std::rt::lang_start_internal::{{closure}} (#std::rt::lang_start_internal:241)
std::panicking::try::do_call (#std::rt::lang_start_internal:241)
std::panicking::try (#std::rt::lang_start_internal:241)
std::panic::catch_unwind (#std::rt::lang_start_internal:241)
std::rt::lang_start_internal (#std::rt::lang_start_internal:241)
std::rt::lang_start (#std::rt::lang_start:13)
main (#main:10)
___lldb_unnamed_symbol3139 (#___lldb_unnamed_symbol3139:29)
__libc_start_main (#__libc_start_main:43)
_start (#_start:15)
Thread2 callstack:
epoll_wait (#epoll_wait:27)
mio::sys::unix::selector::epoll::Selector::select (/home/ak_lo/.cargo/registry/src/github.com-1ecc6299db9ec823/mio-0.8.4/src/sys/unix/selector/epoll.rs:68)
mio::poll::Poll::poll (/home/ak_lo/.cargo/registry/src/github.com-1ecc6299db9ec823/mio-0.8.4/src/poll.rs:400)
tokio::runtime::io::Driver::turn (/home/ak_lo/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.0/src/runtime/io/mod.rs:162)
<tokio::runtime::io::Driver as tokio::park::Park>::park (/home/ak_lo/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.0/src/runtime/io/mod.rs:227)
<tokio::park::either::Either<A,B> as tokio::park::Park>::park (/home/ak_lo/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.0/src/park/either.rs:30)
tokio::time::driver::Driver<P>::park_internal (/home/ak_lo/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.0/src/time/driver/mod.rs:238)
<tokio::time::driver::Driver<P> as tokio::park::Park>::park (/home/ak_lo/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.0/src/time/driver/mod.rs:436)
<tokio::park::either::Either<A,B> as tokio::park::Park>::park (/home/ak_lo/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.0/src/park/either.rs:30)
<tokio::runtime::driver::Driver as tokio::park::Park>::park (/home/ak_lo/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.0/src/runtime/driver.rs:198)
tokio::runtime::scheduler::current_thread::Context::park::{{closure}} (/home/ak_lo/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.0/src/runtime/scheduler/current_thread.rs:308)
tokio::runtime::scheduler::current_thread::Context::enter (/home/ak_lo/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.0/src/runtime/scheduler/current_thread.rs:349)
tokio::runtime::scheduler::current_thread::Context::park (/home/ak_lo/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.0/src/runtime/scheduler/current_thread.rs:307)
tokio::runtime::scheduler::current_thread::CoreGuard::block_on::{{closure}} (/home/ak_lo/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.0/src/runtime/scheduler/current_thread.rs:554)
tokio::runtime::scheduler::current_thread::CoreGuard::enter::{{closure}} (/home/ak_lo/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.0/src/runtime/scheduler/current_thread.rs:595)
tokio::macros::scoped_tls::ScopedKey<T>::set (/home/ak_lo/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.0/src/macros/scoped_tls.rs:61)
tokio::runtime::scheduler::current_thread::CoreGuard::enter (/home/ak_lo/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.0/src/runtime/scheduler/current_thread.rs:595)
tokio::runtime::scheduler::current_thread::CoreGuard::block_on (/home/ak_lo/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.0/src/runtime/scheduler/current_thread.rs:515)
tokio::runtime::scheduler::current_thread::CurrentThread::block_on (/home/ak_lo/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.0/src/runtime/scheduler/current_thread.rs:161)
tokio::runtime::Runtime::block_on (/home/ak_lo/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.0/src/runtime/mod.rs:490)
While stepping through, I noticed that the Data struct is polled twice, and it seems like the context and state might not match up (it should complete with nothing left to return on the second poll; I've confirmed all the data is returned in the first). Does anyone have any idea why the program continues to wait for new data when it certainly won't come?
Edit: just to test, I changed the line that causes the freeze to the following:
let mut body = BodyStream::from_request(req).await.map_err(|_| serv_err)?.take(1);
let buffer = {
let mut buf = Vec::new();
while let Some(chunk) = body.next().await {
let data = chunk.map_err(|_| serv_err)?;
buf.extend(data);
}
buf
};
The test succeeds in this case. But if I increase the take count to any more than 1, the same issue recurs.

I was wrong in assuming the request contents were unrelated. I got the content-length wrong when creating the test request, and it looks like axum was waiting forever for the remaining body bytes as a result.
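For reference, the fix was just to derive Content-Length from the real payload instead of hardcoding it. A minimal sketch, reusing the TestClient calls from the test above (expected_body is a hypothetical stand-in for the omitted body value):

// Compute the Content-Length from the actual bytes being sent, so the
// server never waits for data that will not arrive.
let payload = serde_json::to_string(&expected_body).unwrap();
client.post("/")
    .header("host", "<host>")
    .header("content-length", payload.len()) // must match the body exactly
    .header("content-type", ContentType::json().to_string())
    .header("user-agent", "<user-agent>")
    .body(payload)
    .send().await;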

Related

How can I share a Vector between 2 threads?

I am pretty new to Rust and can't manage to keep the values behind both Arcs updated in the two threads I'm spawning. The idea is that one thread loops over received events and, when one arrives, updates the object, which the other thread constantly watches. How can I achieve that in Rust, or, if this method isn't adequate, is there a better way to do it?
(Concretely: one thread listens for MIDI events while the other re-renders the received notes on an LED strip.)
Here's what I currently have:
main.rs
mod functions;
mod structs;
use crate::functions::*;
use crate::structs::*;
use portmidi as pm;
use rs_ws281x::{ChannelBuilder, ControllerBuilder, StripType};
use std::sync::{Arc, Mutex};
use std::{fs, thread, time};
const MIDI_TIMEOUT: u64 = 10;
const MIDI_CHANNEL: usize = 0;
#[tokio::main]
async fn main() {
let config: Arc<std::sync::Mutex<Config>> = Arc::new(Mutex::new(
toml::from_str(&fs::read_to_string("config.toml").unwrap()).unwrap(),
));
let config_midi = config.clone();
let config_leds = config.clone();
let leds_status = Arc::new(Mutex::new(vec![0; config.lock().unwrap().leds.num_leds]));
let leds_status_midi = Arc::clone(&leds_status);
let leds_status_leds = Arc::clone(&leds_status);
thread::spawn(move || {
let config = config_midi.lock().unwrap();
let midi_context = pm::PortMidi::new().unwrap();
let device_info = midi_context
.device(config.midi.id)
.expect(format!("Could not find device with id {}", config.midi.id).as_str());
println!("Using device {}) {}", device_info.id(), device_info.name());
let input_port = midi_context
.input_port(device_info, config.midi.buffer_size)
.expect("Could not create input port");
let mut leds_status = leds_status_midi.lock().unwrap();
loop {
if let Ok(_) = input_port.poll() {
if let Ok(Some(events)) = input_port.read_n(config.midi.buffer_size) {
for event in events {
let event_type =
get_midi_event_type(event.message.status, event.message.data2);
match event_type {
MidiEventType::NoteOn => {
let key = get_note_position(event.message.data1, &config);
leds_status[key] = 1;
}
MidiEventType::NoteOff => {
let key = get_note_position(event.message.data1, &config);
leds_status[key] = 0;
}
_ => {}
}
}
}
}
thread::sleep(time::Duration::from_millis(MIDI_TIMEOUT));
}
});
thread::spawn(move || {
let config = config_leds.lock().unwrap();
let mut led_controller = ControllerBuilder::new()
.freq(800_000)
.dma(10)
.channel(
MIDI_CHANNEL,
ChannelBuilder::new()
.pin(config.leds.pin)
.count(config.leds.num_leds as i32)
.strip_type(StripType::Ws2812)
.brightness(config.leds.brightness)
.build(),
)
.build()
.unwrap();
loop {
let leds_status = leds_status_leds.lock().unwrap();
print!("\x1b[2J\x1b[1;1H");
println!(
"{:?}",
leds_status.iter().filter(|x| (**x) > 0).collect::<Vec<_>>()
);
}
});
}
functions.rs
use crate::structs::MidiEventType;
pub fn get_note_position(note: u8, config: &crate::structs::Config) -> usize {
let mut note_offset = 0;
for i in 0..config.leds.offsets.len() {
if note > config.leds.offsets[i][0] {
note_offset = config.leds.offsets[i][1];
break;
}
}
note_offset -= config.leds.shift;
let note_pos_raw = 2 * (note - 20) - note_offset;
config.leds.num_leds - (note_pos_raw as usize)
}
pub fn get_midi_event_type(status: u8, velocity: u8) -> MidiEventType {
if status == 144 && velocity > 0 {
MidiEventType::NoteOn
} else if status == 128 || (status == 144 && velocity == 0) {
MidiEventType::NoteOff
} else {
MidiEventType::ControlChange
}
}
structs.rs
use serde_derive::Deserialize;
#[derive(Deserialize, Debug)]
pub struct Config {
pub leds: LedsConfig,
pub midi: MidiConfig,
}
#[derive(Deserialize, Debug)]
pub struct LedsConfig {
pub pin: i32,
pub num_leds: usize,
pub brightness: u8,
pub offsets: Vec<Vec<u8>>,
pub shift: u8,
pub fade: i8,
}
#[derive(Deserialize, Debug)]
pub struct MidiConfig {
pub id: i32,
pub buffer_size: usize,
}
#[derive(Debug)]
pub enum MidiEventType {
NoteOn,
NoteOff,
ControlChange,
}
Thank you very much!
The idea would be that one thread loops over received events and when it receives one, updates the object, which the other thread constantly watches.
That's a good way to do it, particularly if one of the threads needs to be near-realtime (e.g. live audio processing). You can use channels to achieve this: you transfer the sender to one thread and the receiver to the other. In a realtime scenario, the receiver can loop until try_recv returns an Empty error (limiting itself to some number of iterations to prevent starvation of the processing code). For example, something like this, given r: Receiver<T>:
// Process 100 messages max to not starve the thread of the other stuff
// it needs to be doing.
for _ in 0..100 {
match r.try_recv() {
Ok(msg) => { /* Process msg, applying it to the current state */ },
Err(TryRecvError::Empty) => break,
Err(TryRecvError::Disconnected) => {
// The sender is gone, maybe this is our signal to terminate?
return;
},
}
}
Alternatively, if one thread needs to act only when a message is received, it can simply iterate the receiver, which will continue to loop as long as messages are received and the channel is open:
for msg in r {
// Handle the message
}
It really is that simple. If the channel is empty but there are senders alive, it will block until a message is received. Once all senders are gone and the channel is empty, the loop will terminate.
A channel can convey messages of exactly one type; if only one kind of message needs to be sent, you can use a struct. Otherwise, an enum with variants for each kind of message works well.
Given the sending side of the channel, s: Sender<T>, you just call s.send(your_message_value).
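Putting both ends together for the MIDI/LED scenario, here is a minimal self-contained sketch with std::sync::mpsc (LedEvent and the key indices are hypothetical stand-ins for your MIDI types):

use std::sync::mpsc;
use std::thread;

// Hypothetical message type: one variant per kind of event to convey.
enum LedEvent {
    On(usize),
    Off(usize),
}

fn main() {
    let (s, r) = mpsc::channel::<LedEvent>();

    // Stands in for the MIDI-listening thread.
    let producer = thread::spawn(move || {
        s.send(LedEvent::On(3)).unwrap();
        s.send(LedEvent::Off(3)).unwrap();
        // Dropping `s` here closes the channel, ending the consumer's loop.
    });

    // Stands in for the LED-rendering thread.
    let consumer = thread::spawn(move || {
        let mut leds = vec![0u8; 88];
        for msg in r {
            match msg {
                LedEvent::On(key) => leds[key] = 1,
                LedEvent::Off(key) => leds[key] = 0,
            }
        }
        println!("final state: {:?}", leds);
    });

    producer.join().unwrap();
    consumer.join().unwrap();
}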
Another option would be to create an Arc<Mutex<_>>, which it looks like you are doing in your sample code. This way is fine if the lock contention is not too high, but this can inhibit the ability of both threads to run concurrently, which is often the goal of multithreading. Channels tend to work better in message-passing scenarios because there isn't a need for a mutual exclusion lock.
As a side note, you are using Tokio with an async main(), but you never actually do anything with any futures, so there's no reason to even use Tokio in this code.

How to idiomatically share data between closures with wasm-bindgen?

In my browser application, two closures access data stored in a Rc<RefCell<T>>. One closure mutably borrows the data, while the other immutably borrows it. The two closures are invoked independently of one another, and this will occasionally result in a BorrowError or BorrowMutError.
Here is my attempt at an MWE, though it uses a future to artificially inflate the likelihood of the error occurring:
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::task::{Context, Poll, Waker};
use wasm_bindgen::prelude::*;
use wasm_bindgen::JsValue;
#[wasm_bindgen]
extern "C" {
#[wasm_bindgen(js_namespace = console)]
pub fn log(s: &str);
#[wasm_bindgen(js_name = setTimeout)]
fn set_timeout(closure: &Closure<dyn FnMut()>, millis: u32) -> i32;
#[wasm_bindgen(js_name = setInterval)]
fn set_interval(closure: &Closure<dyn FnMut()>, millis: u32) -> i32;
}
pub struct Counter(u32);
#[wasm_bindgen(start)]
pub async fn main() -> Result<(), JsValue> {
console_error_panic_hook::set_once();
let counter = Rc::new(RefCell::new(Counter(0)));
let counter_clone = counter.clone();
let log_closure = Closure::wrap(Box::new(move || {
let c = counter_clone.borrow();
log(&c.0.to_string());
}) as Box<dyn FnMut()>);
set_interval(&log_closure, 1000);
log_closure.forget();
let counter_clone = counter.clone();
let increment_closure = Closure::wrap(Box::new(move || {
let counter_clone = counter_clone.clone();
wasm_bindgen_futures::spawn_local(async move {
let mut c = counter_clone.borrow_mut();
// In reality this future would be replaced by some other
// time-consuming operation manipulating the borrowed data
SleepFuture::new(5000).await;
c.0 += 1;
});
}) as Box<dyn FnMut()>);
set_timeout(&increment_closure, 3000);
increment_closure.forget();
Ok(())
}
struct SleepSharedState {
waker: Option<Waker>,
completed: bool,
closure: Option<Closure<dyn FnMut()>>,
}
struct SleepFuture {
shared_state: Rc<RefCell<SleepSharedState>>,
}
impl Future for SleepFuture {
type Output = ();
fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
let mut shared_state = self.shared_state.borrow_mut();
if shared_state.completed {
Poll::Ready(())
} else {
shared_state.waker = Some(cx.waker().clone());
Poll::Pending
}
}
}
impl SleepFuture {
fn new(duration: u32) -> Self {
let shared_state = Rc::new(RefCell::new(SleepSharedState {
waker: None,
completed: false,
closure: None,
}));
let state_clone = shared_state.clone();
let closure = Closure::wrap(Box::new(move || {
let mut state = state_clone.borrow_mut();
state.completed = true;
if let Some(waker) = state.waker.take() {
waker.wake();
}
}) as Box<dyn FnMut()>);
set_timeout(&closure, duration);
shared_state.borrow_mut().closure = Some(closure);
SleepFuture { shared_state }
}
}
panicked at 'already mutably borrowed: BorrowError'
The error makes sense, but how should I go about resolving it?
My current solution is to have the closures use try_borrow or try_borrow_mut, and if unsuccessful, use setTimeout for an arbitrary amount of time before attempting to borrow again.
Think about this problem independently of Rust's borrow semantics. You have a long-running operation that's updating some shared state.
How would you do it if you were using threads? You would put the shared state behind a lock. RefCell is like a lock, except that you can't block on unlocking it; you can, however, emulate blocking by using some kind of message-passing to wake up the reader.
How would you do it if you were using pure JavaScript? You don't automatically have anything like RefCell, so either:
The state can be safely read while the operation is still ongoing (in a concurrency-not-parallelism sense): in this case, emulate that by not holding a single RefMut (result of borrow_mut()) alive across an await boundary.
The state is not safe to be read: you'd either write something lock-like as described above, or perhaps arrange so that it's only written once when the operation is done, and until then, the long-running operation has its own private state not shared with the rest of the application (so there can be no BorrowError conflicts).
Think about what your application actually needs and pick a suitable solution. Implementing any of these solutions will most likely involve having additional interior-mutable objects used for communication.
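As a concrete illustration of the first option, the increment closure from the question could re-borrow around the await instead of holding the RefMut across it; a minimal sketch using the same types as the example above:

wasm_bindgen_futures::spawn_local(async move {
    // The long-running work happens while no borrow is held...
    SleepFuture::new(5000).await;
    // ...and the mutable borrow is taken only for the brief synchronous
    // update, so the logging closure can never observe a live RefMut.
    counter_clone.borrow_mut().0 += 1;
});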

Verify that method was called the expected number of times with correct parameters [duplicate]

This question already has answers here:
How to check if a function has been called in Rust?
(2 answers)
How to mock specific methods but not all of them in Rust?
(2 answers)
Closed 3 years ago.
Summary
I have an application that starts another process and transfers its stdout/stderr to a log file using the log crate. My application transfers the output line by line (buf_read.read_line()). Since it can be any arbitrary process, my application assumes the other process may be malicious and may try to print enormous amounts of data to stdout/stderr without a single newline, thereby causing OOM in my application. Hence my application limits the number of bytes the BufReader can read at a time using BufReader.take(), roughly as sketched below.
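That limiting step might look like this (a sketch, assuming a hypothetical 8 KiB cap per read; a real implementation would also need to handle read_line failing when the cap splits a multi-byte UTF-8 character):

use std::io::{BufRead, BufReader, Read};

fn transfer<R: Read>(source: R, logger: &dyn Fn(&str)) -> std::io::Result<()> {
    let mut reader = BufReader::new(source);
    loop {
        let mut line = String::new();
        // take() caps how many bytes this call may pull from the reader,
        // so a line that never ends cannot exhaust memory.
        let n = reader.by_ref().take(8192).read_line(&mut line)?;
        if n == 0 {
            break; // EOF
        }
        logger(line.trim_end_matches('\n'));
    }
    Ok(())
}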
The problem
Ignoring all the details about chunking the input, how can I test that my logger was called X times with the correct parameters? Let's assume my app has read one huge line and split it into 3 parts, as in the MCVE below.
MCVE:
use std::thread::JoinHandle;
fn main() {
let handle = start_transfer_thread(&|x| {
println!("X={}", x);
}).join();
}
fn start_transfer_thread<F>(logger: &'static F) -> JoinHandle<()> where F: Send + Sync + Fn(&str) -> () {
std::thread::spawn(move || {
logger("1");
logger("2");
logger("3");
})
}
#[cfg(test)]
mod test {
use super::*;
#[test]
fn test_logged_in_order() {
let result = start_transfer_thread(&|x| {
match x {
"1" => (),
"2" => (),
"3" => (),
x => panic!("unexpected token: {}", x)
}
}).join();
assert!(result.is_ok());
}
}
I was able to do this by replacing the function/closure with a trait object:
trait Logger: Send + Sync {
fn log(&mut self, log_name: &str, data: &str);
}
struct StandardLogger;
impl Logger for StandardLogger {
fn log(&mut self, log_name: &str, data: &str) {
log::logger().log(
&log::Record::builder()
.level(log::Level::Info)
.target(log_name)
.args(format_args!("{}", data))
.build(),
);
}
}
For the tests I use another implementation:
struct DummyLogger {
tx: Mutex<Sender<String>>,
}
impl DummyLogger {
pub fn new() -> (DummyLogger, Receiver<String>) {
let (tx, rx) = std::sync::mpsc::channel();
let logger = DummyLogger { tx: Mutex::new(tx) };
(logger, rx)
}
}
impl Logger for DummyLogger {
fn log(&mut self, log_name: &str, data: &str) {
let tx = self.tx.lock().unwrap();
tx.send(data.to_owned()).unwrap();
}
}
This allows me to verify that it was called both the correct number of times and with the correct parameters:
let actual: Vec<String> = rx.iter().collect();
assert_eq!(actual, vec!["1", "2", "3"]);
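A hypothetical end-to-end wiring of those pieces, with the thread body standing in for the real transfer loop:

#[test]
fn test_logged_in_order() {
    let (mut logger, rx) = DummyLogger::new();
    let handle = std::thread::spawn(move || {
        logger.log("test", "1");
        logger.log("test", "2");
        logger.log("test", "3");
        // `logger` (and the Sender inside it) is dropped here, closing the
        // channel so that rx.iter() below terminates.
    });
    handle.join().unwrap();
    let actual: Vec<String> = rx.iter().collect();
    assert_eq!(actual, vec!["1", "2", "3"]);
}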

How to add special NotReady logic to tokio-io?

I'm trying to make a Stream that waits until a specific character is in the buffer. I know there's read_until() on BufRead, but I actually need a custom solution, as this is a stepping stone to waiting until a specific string is in the buffer (or, for example, until a regex match happens).
In the project where I first encountered the problem, future processing simply hung when I got a Ready(_) from the inner future and returned NotReady from my function. I discovered I shouldn't do that, per the docs (last paragraph). However, what I didn't get is what the actual alternative promised in that paragraph is. I've read all the published documentation on the Tokio site, and it doesn't make sense to me at the moment.
So the following is my current code. Unfortunately I couldn't make it any simpler or smaller, as it's already broken. The current result is this:
Err(Custom { kind: Other, error: Error(Shutdown) })
Err(Custom { kind: Other, error: Error(Shutdown) })
Err(Custom { kind: Other, error: Error(Shutdown) })
<ad infinitum>
The expected result is getting some Ok(Ready(_)) out of it, printing W and W' while waiting for the specific character to appear in the buffer.
extern crate futures;
extern crate tokio_core;
extern crate tokio_io;
extern crate tokio_io_timeout;
extern crate tokio_process;
use futures::stream::poll_fn;
use futures::{Async, Poll, Stream};
use tokio_core::reactor::Core;
use tokio_io::AsyncRead;
use tokio_io_timeout::TimeoutReader;
use tokio_process::CommandExt;
use std::process::{Command, Stdio};
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;
struct Process {
child: tokio_process::Child,
stdout: Arc<Mutex<tokio_io_timeout::TimeoutReader<tokio_process::ChildStdout>>>,
}
impl Process {
fn new(
command: &str,
reader_timeout: Option<Duration>,
core: &tokio_core::reactor::Core,
) -> Self {
let mut cmd = Command::new(command);
let cat = cmd.stdout(Stdio::piped());
let mut child = cat.spawn_async(&core.handle()).unwrap();
let stdout = child.stdout().take().unwrap();
let mut timeout_reader = TimeoutReader::new(stdout);
timeout_reader.set_timeout(reader_timeout);
let timeout_reader = Arc::new(Mutex::new(timeout_reader));
Self {
child,
stdout: timeout_reader,
}
}
}
fn work() -> Result<(), ()> {
let window = Arc::new(Mutex::new(Vec::new()));
let mut core = Core::new().unwrap();
let process = Process::new("cat", Some(Duration::from_secs(20)), &core);
let mark = Arc::new(Mutex::new(b'c'));
let read_until_stream = poll_fn({
let window = window.clone();
let timeout_reader = process.stdout.clone();
move || -> Poll<Option<u8>, std::io::Error> {
let mut buf = [0; 8];
let poll;
{
let mut timeout_reader = timeout_reader.lock().unwrap();
poll = timeout_reader.poll_read(&mut buf);
}
match poll {
Ok(Async::Ready(0)) => Ok(Async::Ready(None)),
Ok(Async::Ready(x)) => {
{
let mut window = window.lock().unwrap();
println!("W: {:?}", *window);
println!("buf: {:?}", &buf[0..x]);
window.extend(buf[0..x].into_iter().map(|x| *x));
println!("W': {:?}", *window);
if let Some(_) = window.iter().find(|c| **c == *mark.lock().unwrap()) {
Ok(Async::Ready(Some(1)))
} else {
Ok(Async::NotReady)
}
}
}
Ok(Async::NotReady) => Ok(Async::NotReady),
Err(e) => Err(e),
}
}
});
let _stream_thread = thread::spawn(move || {
for o in read_until_stream.wait() {
println!("{:?}", o);
}
});
match core.run(process.child) {
Ok(_) => {}
Err(e) => {
println!("Child error: {:?}", e);
}
}
Ok(())
}
fn main() {
work().unwrap();
}
This is complete example project.
If you need more data you need to call poll_read again until you either find what you were looking for or poll_read returns NotReady.
You might want to avoid looping in one task for too long, so you can build yourself a yield_task function to call instead of looping again when poll_read didn't return NotReady; it makes sure your task gets called again as soon as possible, after other pending tasks have run.
To use it, just run return yield_task();.
fn yield_inner() {
use futures::task;
task::current().notify();
}
#[inline(always)]
pub fn yield_task<T, E>() -> Poll<T, E> {
yield_inner();
Ok(Async::NotReady)
}
Also see futures-rs#354: Handle long-running, always-ready futures fairly.
With the new async/await API, futures::task::current is gone; instead you'll need a std::task::Context reference, which is provided as a parameter to the new std::future::Future::poll trait method.
If you're already manually implementing the std::future::Future trait you can simply insert:
context.waker().wake_by_ref();
return std::task::Poll::Pending;
Or build yourself a Future-implementing type that yields exactly once:
pub struct Yield {
ready: bool,
}
impl core::future::Future for Yield {
type Output = ();
fn poll(self: core::pin::Pin<&mut Self>, cx: &mut core::task::Context<'_>) -> core::task::Poll<Self::Output> {
let this = self.get_mut();
if this.ready {
core::task::Poll::Ready(())
} else {
cx.waker().wake_by_ref();
this.ready = true; // ready next round
core::task::Poll::Pending
}
}
}
pub fn yield_task() -> Yield {
Yield { ready: false }
}
And then use it in async code like this:
yield_task().await;
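For example, the marker search from the question at the top could yield between chunks so a long scan doesn't monopolize the executor (a sketch; the chunk size is arbitrary):

// Scan for a marker byte, yielding to the executor between chunks.
async fn find_marker(data: &[u8], mark: u8) -> Option<usize> {
    for (i, chunk) in data.chunks(1024).enumerate() {
        if let Some(off) = chunk.iter().position(|&b| b == mark) {
            return Some(i * 1024 + off);
        }
        yield_task().await; // let other pending tasks run before the next chunk
    }
    None
}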

How to implement a long running process with progress in Rust, available via a Rest api?

I am a beginner in Rust.
I have a long-running IO-bound process that I want to spawn and monitor via a REST API. I chose Iron for that, following this tutorial. Monitoring means getting its progress and its final result.
When I spawn it, I give it an id and map that id to a resource that I can GET to fetch the progress. I don't have to be exact with the progress; I can report the progress from 5 seconds ago.
My first attempt was to have a channel via which I send requests for progress and receive the status. I got stuck on where to store the receiver, since, as I understand it, it belongs to one thread only. I wanted to put it in the context of the request, but that won't work, as different threads handle subsequent requests.
What would be the idiomatic way to do this in Rust?
I have a sample project.
Later edit:
Here is a self-contained example which follows the same principle as the answer, namely a map in which each thread updates its progress:
extern crate iron;
extern crate router;
extern crate rustc_serialize;
use iron::prelude::*;
use iron::status;
use router::Router;
use rustc_serialize::json;
use std::io::Read;
use std::sync::{Mutex, Arc};
use std::thread;
use std::time::Duration;
use std::collections::HashMap;
#[derive(Debug, Clone, RustcEncodable, RustcDecodable)]
pub struct Status {
pub progress: u64,
pub context: String
}
#[derive(RustcEncodable, RustcDecodable)]
struct StartTask {
id: u64
}
fn start_process(status: Arc<Mutex<HashMap<u64, Status>>>, task_id: u64) {
let c = status.clone();
thread::spawn(move || {
for i in 1..100 {
{
let m = &mut c.lock().unwrap();
m.insert(task_id, Status{ progress: i, context: "in progress".to_string()});
}
thread::sleep(Duration::from_secs(1));
}
let m = &mut c.lock().unwrap();
m.insert(task_id, Status{ progress: 100, context: "done".to_string()});
});
}
fn main() {
let status: Arc<Mutex<HashMap<u64, Status>>> = Arc::new(Mutex::new(HashMap::new()));
let status_clone: Arc<Mutex<HashMap<u64, Status>>> = status.clone();
let mut router = Router::new();
router.get("/:taskId", move |r: &mut Request| task_status(r, &status.lock().unwrap()));
router.post("/start", move |r: &mut Request|
start_task(r, status_clone.clone()));
fn task_status(req: &mut Request, statuses: & HashMap<u64,Status>) -> IronResult<Response> {
let ref task_id = req.extensions.get::<Router>().unwrap().find("taskId").unwrap_or("/").parse::<u64>().unwrap();
let payload = json::encode(&statuses.get(&task_id)).unwrap();
Ok(Response::with((status::Ok, payload)))
}
// Receive a message by POST and play it back.
fn start_task(request: &mut Request, statuses: Arc<Mutex<HashMap<u64, Status>>>) -> IronResult<Response> {
let mut payload = String::new();
request.body.read_to_string(&mut payload).unwrap();
let task_start_request: StartTask = json::decode(&payload).unwrap();
start_process(statuses, task_start_request.id);
Ok(Response::with((status::Ok, json::encode(&task_start_request).unwrap())))
}
Iron::new(router).http("localhost:3000").unwrap();
}
One possibility is to use a global HashMap that associates each worker id with its progress (and result). Here is a simple example (without the REST stuff):
#[macro_use]
extern crate lazy_static;
use std::sync::Mutex;
use std::collections::HashMap;
use std::thread;
use std::time::Duration;
lazy_static! {
static ref PROGRESS: Mutex<HashMap<usize, usize>> = Mutex::new(HashMap::new());
}
fn set_progress(id: usize, progress: usize) {
// insert replaces the old value if there was one.
PROGRESS.lock().unwrap().insert(id, progress);
}
fn get_progress(id: usize) -> Option<usize> {
PROGRESS.lock().unwrap().get(&id).cloned()
}
fn work(id: usize) {
println!("Creating {}", id);
set_progress(id, 0);
for i in 0..100 {
set_progress(id, i + 1);
// simulates work
thread::sleep(Duration::new(0, 50_000_000));
}
}
fn monitor(id: usize) {
loop {
if let Some(p) = get_progress(id) {
if p == 100 {
println!("Done {}", id);
// to avoid leaks, remove id from PROGRESS.
// maybe save that the task ends in a data base.
return
} else {
println!("Progress {}: {}", id, p);
}
}
thread::sleep(Duration::new(1, 0));
}
}
fn main() {
let w = thread::spawn(|| work(1));
let m = thread::spawn(|| monitor(1));
w.join().unwrap();
m.join().unwrap();
}
You need to register one channel per request thread, because even if cloning Receivers were possible, responses might end up in the wrong thread when two requests run at the same time.
Instead of having your thread create a channel for answering requests, use a future. A future gives you a handle to an object that doesn't exist yet. You can change the input channel to receive a Promise, which you then fulfill; no output channel is necessary.
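A minimal sketch of that idea with plain std threads (ProgressRequest is a hypothetical message type; each request carries its own one-shot reply channel, which plays the role of the promise):

use std::sync::mpsc;
use std::thread;

// Each progress query carries the sending half of a dedicated reply channel.
struct ProgressRequest {
    reply: mpsc::Sender<u64>,
}

fn main() {
    let (req_tx, req_rx) = mpsc::channel::<ProgressRequest>();

    // Worker: owns its progress counter and answers queries as they arrive.
    let worker = thread::spawn(move || {
        let mut progress = 0u64;
        for req in req_rx {
            progress += 10; // simulated work between queries
            let _ = req.reply.send(progress); // fulfill the promise
        }
    });

    // A request-handling thread asks for progress and blocks on the reply.
    let (reply_tx, reply_rx) = mpsc::channel();
    req_tx.send(ProgressRequest { reply: reply_tx }).unwrap();
    println!("progress: {}%", reply_rx.recv().unwrap());

    drop(req_tx); // closing the request channel lets the worker exit
    worker.join().unwrap();
}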
