How can I read non-blocking from stdin? - rust

Is there a way to check whether data is available on stdin in Rust, or to do a read that returns immediately with the currently available data?
My goal is to be able to read the input produced for instance by cursor keys in a shell that is setup to return all read data immediately. For instance with an equivalent to: stty -echo -echok -icanon min 1 time 0.
I suppose one solution would be to use ncurses or similar libraries, but I would like to avoid any kind of large dependencies.
So far, I got only blocking input, which is not what I want:
let mut reader = stdin();
let mut s = String::new();
match reader.read_to_string(&mut s) {...} // this blocks :(

Converting OP's comment into an answer:
You can spawn a thread and send data over a channel. You can then poll that channel in the main thread using try_recv.
use std::io;
use std::sync::mpsc;
use std::sync::mpsc::Receiver;
use std::sync::mpsc::TryRecvError;
use std::{thread, time};
fn main() {
let stdin_channel = spawn_stdin_channel();
loop {
match stdin_channel.try_recv() {
Ok(key) => println!("Received: {}", key),
Err(TryRecvError::Empty) => println!("Channel empty"),
Err(TryRecvError::Disconnected) => panic!("Channel disconnected"),
}
sleep(1000);
}
}
fn spawn_stdin_channel() -> Receiver<String> {
let (tx, rx) = mpsc::channel::<String>();
thread::spawn(move || loop {
let mut buffer = String::new();
io::stdin().read_line(&mut buffer).unwrap();
tx.send(buffer).unwrap();
});
rx
}
fn sleep(millis: u64) {
let duration = time::Duration::from_millis(millis);
thread::sleep(duration);
}

Most operating systems default to work with the standard input and output in a blocking way. No wonder then that the Rust library follows in stead.
To read from a blocking stream in a non-blocking way you might create a separate thread, so that the extra thread blocks instead of the main one. Checking whether a blocking file descriptor produced some input is similar: spawn a thread, make it read the data, check whether it produced any data so far.
Here's a piece of code that I use with a similar goal of processing a pipe output interactively and that can hopefully serve as an example. It sends the data over a channel, which supports the try_recv method - allowing you to check whether the data is available or not.
Someone has told me that mio might be used to read from a pipe in a non-blocking way, so you might want to check it out too. I suspect that passing the stdin file descriptor (0) to Receiver::from_raw_fd should just work.

You could also potentially look at using ncurses (also on crates.io) which would allow you read in raw mode. There are a few examples in the Github repository which show how to do this.

Related

How to create future that completes once tokio::process::Child has exited without closing stdin

How can I create a future that completes upon the termination of a tokio::process::Child without closing stdin. I know there is try_wait for testing if a process has terminated without closing stdin, but I want to have this behavior with future semantics.
I tried to prepare a MRE for this question where my code panics as a result of writing to stdin after calling wait, but what I observe does not match the behavior stated in the documentation for tokio::process::Child's wait method. I would expect to see that the line stdin.write_u8(24).await.unwrap(); crashes with a broken pipe since stdin should have been closed by wait.
use tokio::{time, io::AsyncWriteExt}; // 1.0.1
use std::time::Duration;
#[tokio::main]
pub async fn main() {
let mut child = tokio::process::Command::new("nano")
.stdin(std::process::Stdio::piped())
.spawn()
.unwrap();
let mut stdin = child.stdin.take().unwrap();
let tasklet = tokio::spawn(async move {
child.wait().await
});
// this delay should give tokio::spawn plenty of time to spin up
// and call `wait` on the child (closing stdin)
time::sleep(Duration::from_millis(1000)).await;
// write character 24 (CANcel, ^X) to stdin to close nano
stdin.write_u8(24).await.unwrap();
match tasklet.await {
Ok(exit_result) => match exit_result {
Ok(exit_status) => eprintln!("exit_status: {}", exit_status),
Err(terminate_error) => eprintln!("terminate_error: {}", terminate_error)
}
Err(join_error) => eprintln!("join_error: {}", join_error)
}
}
So the answer to this question is to Option::take ChildStdin out of tokio::process::Child as described in this Github issue. In this case, wait will not close stdin and the programmer is responsible for not causing deadlocks.
The MRE above doesn't fail for two reasons: (i) I took ChildStdin out of tokio::process::Child and (ii) even if I hadn't taken it out, it still would not have been closed due to a bug in the code that will be fixed in this pull request.

How to exchange data in a block_on section?

I'm learning Rust and Tokio and I suspect I may be going in the wrong direction.
I'm trying to open a connection to a remote server and perform a handshake. I want to use non-blocking IO so I'm using Tokio's thread pool. The handshake needs to be performed quickly or the remote will close the socket so I'm trying to chain the message exchange in a single block_on section:
let result: Result<(), Box<dyn std::error::Error>> = session
.runtime()
.borrow_mut()
.block_on(async {
let startup = startup(session.configuration());
stream.write_all(startup.as_ref()).await?;
let mut buffer:Vec<u8> = Vec::new();
let mut tmp = [0u8; 1];
loop {
let total = stream.read(&mut tmp).await;
/*
if total == 0 {
break;
}
*/
if total.is_err() {
break;
}
buffer.extend(&tmp);
}
Ok(())
});
My problem is what to do when there are no more bytes in the socket to read. My current implementation reads the response and after the last byte hangs, I believe because the socket is not closed. I thought checking for 0 bytes read would be enough but the call to read() never returns.
What's the best way to handle this?
From your comment:
Nope, the connection is meant to remain open.
If you read from an open connection, the read will block until there are enough bytes to satisfy it or the other end closes the connection, similar to how blocking reads work in C. Tokio is working as-intended.
If closing the stream does not signal the end of a message, then you will have to do your own work to figure out when to stop reading and start processing. A simple way would to just prefix the request with a length, and only read that many bytes.
Note that you'd have to do the above no matter what API you'd use. The fact that you use tokio or not doesn't really answer the fundamental question of "when is the message over".

Runtime returning without completing all tasks despite using block_on in Tokio

I am writing a multi-threaded concurrent Kafka producer using Rust and Tokio. The project has 2 modes, an interactive mode that runs in an infinite loop and a file mode which takes a file as an argument and then reads the file and sends these messages to Kafka via multiple threads. Interactive mode works fine! but file mode has issues.
To achieve this, I had initially started with Rayon, but then switched to a more flexible runtime; tokio. Now, I am able to parallelize the task of sending data over a specified number of threads within tokio, however, it seems that runtime is getting dropped before all messages are produced. Here is my code:
pub fn worker(brokers: String, f: File, t: usize, topic: Arc<String>) {
let reader = BufReader::new(f);
let mut rt = runtime::Builder::new()
.threaded_scheduler()
.core_threads(t)
.build()
.unwrap();
let producers: Arc<Vec<Mutex<BaseProducer>>> = Arc::new(
(0..t)
.map(|_| get_producer(&brokers))
.collect::<Vec<Mutex<BaseProducer>>>(),
);
let acounter = atomic::AtomicUsize::new(0);
let _results: Vec<_> = reader
.lines()
.map(|line| line.unwrap())
.map(move |line| {
let prods = producers.clone();
let tp = topic.clone();
let cnt = acounter.swap(
(acounter.load(atomic::Ordering::SeqCst) + 1) % t,
atomic::Ordering::SeqCst,
);
rt.block_on(async move {
match prods[cnt]
.lock()
.unwrap()
.send(BaseRecord::to(&(*tp)).payload(&line).key(""))
{
Ok(_) => (),
Err(e) => eprintln!("{:?}", e),
};
})
})
.collect();
}
fn get_producer(brokers: &String) -> Mutex<BaseProducer> {
Mutex::new(
BaseProducer::from_config(
ClientConfig::new()
.set("bootstrap.servers", &brokers)
.set("message.timeout.ms", "5000"),
)
.expect("Producer creation error"),
)
}
As a high-level walkthrough: I create mutable producers equal to the number of threads specified and every task within this thread will use one of these producers. The file is read line by line sequentially and every line is moved into the closure that produces it as a message to Kafka.
The code works fine, for the most part, but there are issues related to the runtime exiting without completing all tasks, even when I am using the block_on function in the runtime. Which is supposed to block until the future is complete (Async block in my case here).
I believe the issue is that the issue is with runtime getting dropped without all the threading within Tokio exiting successfully.
I tried reading a file with this approach habing 100,000 records, on a single thread, I was able to produce 28,000 records. On 2 threads, close to 46,000 records. And while utilising all 8 cores of my CPU, I was getting 99,000-100,000 messages indeterministically.
I have checked several answers on SO, but none help in my case. I also read through the documentation of tokio::runtime::Runtime here and tried to use spawn and then use futures::future::join, but that didn't work either.
Any help is appreciated!

How can I stop reading from a tokio::io::lines stream?

I want to terminate reading from a tokio::io::lines stream. I merged it with a oneshot future and terminated it, but tokio::run was still working.
use futures::{sync::oneshot, *}; // 0.1.27
use std::{io::BufReader, time::Duration};
use tokio::prelude::*; // 0.1.21
fn main() {
let (tx, rx) = oneshot::channel::<()>();
let lines = tokio::io::lines(BufReader::new(tokio::io::stdin()));
let lines = lines.for_each(|item| {
println!("> {:?}", item);
Ok(())
});
std::thread::spawn(move || {
std::thread::sleep(Duration::from_millis(5000));
println!("system shutting down");
let _ = tx.send(());
});
let lines = lines.select2(rx);
tokio::run(lines.map(|_| ()).map_err(|_| ()));
}
How can I stop reading from this?
There's nothing wrong with your strategy, but it will only work with futures that don't execute a blocking operation via Tokio's blocking (the traditional kind of blocking should never be done inside a future).
You can test this by replacing the tokio::io::lines(..) future with a simple interval future:
let lines = Interval::new(Instant::now(), Duration::from_secs(1));
The problem is that tokio::io::Stdin internally uses tokio_threadpool::blocking .
When you use Tokio thread pool blocking (emphasis mine):
NB: The entire task that called blocking is blocked whenever the
supplied closure blocks, even if you have used future combinators such
as select - the other futures in this task will not make progress
until the closure returns. If this is not desired, ensure that
blocking runs in its own task (e.g. using
futures::sync::oneshot::spawn).
Since this will block every other future in the combinator, your Receiver will not be able to get a signal from the Senderuntil the blocking ends.
Please see How can I read non-blocking from stdin? or you can use tokio-stdin-stdout, which creates a channel to consume data from stdin thread. It also has a line-by-line example.
Thank you for your comment and correcting my sentences.
I tried to stop this non-blocking Future and succeeded.
let lines = Interval::new(Instant::now(), Duration::from_secs(1));
My understating is that it would work for this case to wrap the blocking Future with tokio threadpool::blocking.
I'll try it later.
Thank you very much.

How to use tokio's UdpSocket to handle messages in a 1 server: N clients setup?

What I want to do:
... write a (1) server/ (N) clients (network-game-)architecture that uses UDP sockets as underlying base for communication.
Messages are sent as Vec<u8>, encoded via bincode (crate)
I also want to be able to occasionally send datagrams that can exceed the typical max MTU of ~1500 bytes and be correctly assembled on receiver end, including sending of ack-messages etc. (I assume I'll have to implement that myself, right?)
For the UdpSocket I thought about using tokio's implementation and maybe framed. I am not sure whether this is a good choice though, as it seems that this would introduce an unnecessary step of mapping Vec<u8> (serialized by bincode) to Vec<u8> (needed by UdpCodec of tokio) (?)
Consider this minimal code-example:
Cargo.toml (server)
bincode = "1.0"
futures = "0.1"
tokio-core = "^0.1"
(Serde and serde-derive are used in shared crate where the protocol is defined!)
(I want to replace tokio-core with tokio asap)
fn main() -> () {
let addr = format!("127.0.0.1:{port}", port = 8080);
let addr = addr.parse::<SocketAddr>().expect(&format!("Couldn't create valid SocketAddress out of {}", addr));
let mut core = Core::new().unwrap();
let handle = core.handle();
let socket = UdpSocket::bind(&addr, &handle).expect(&format!("Couldn't bind socket to address {}", addr));
let udp_future = socket.framed(MyCodec {}).for_each(|(addr, data)| {
socket.send_to(&data, &addr); // Just echo back the data
Ok(())
});
core.run(udp_future).unwrap();
}
struct MyCodec;
impl UdpCodec for MyCodec {
type In = (SocketAddr, Vec<u8>);
type Out = (SocketAddr, Vec<u8>);
fn decode(&mut self, src: &SocketAddr, buf: &[u8]) -> io::Result<Self::In> {
Ok((*src, buf.to_vec()))
}
fn encode(&mut self, msg: Self::Out, buf: &mut Vec<u8>) -> SocketAddr {
let (addr, mut data) = msg;
buf.append(&mut data);
addr
}
}
The problem here is:
let udp_future = socket.framed(MyCodec {}).for_each(|(addr, data)| {
| ------ value moved here ^^^^^^^^^^^^^^ value captured here after move
|
= note: move occurs because socket has type tokio_core::net::UdpSocket, which does not implement the Copy trait
The error makes total sense, yet I am not sure how I would create such a simple echo-service. In reality, the handling of a message involves a bit more logic ofc, but for the sake of a minimal example, this should be enough to give a rough idea.
My workaround is an ugly hack: creating a second socket.
Here's the signature of UdpSocket::framed from Tokio's documentation:
pub fn framed<C: UdpCodec>(self, codec: C) -> UdpFramed<C>
Note that it takes self, not &self; that is, calling this function consumes the socket. The UdpFramed wrapper owns the underlying socket when you call this. Your compilation error is telling you that you're moving socket when you call this method, but you're also trying to borrow socket inside your closure (to call send_to).
This probably isn't what you want for real code. The whole point of using framed() is to turn your socket into something higher-level, so you can send your codec's items directly instead of having to assemble datagrams. Using send or send_to directly on the socket will probably break the framing of your message protocol. In this code, where you're trying to implement a simple echo server, you don't need to use framed at all. But if you do want to have your cake and eat it and use both framed and send_to, luckily UdpFramed still allows you to borrow the underlying UdpSocket, using get_ref. You can fix your problem this way:
let framed = {
let socket = UdpSocket::bind(&addr, &handle).expect(&format!("Couldn't bind socket to address {}", addr));
socket.framed(MyCodec {})
}
let udp_future = framed.for_each(|(addr, data)| {
info!(self.logger, "Udp packet received from {}: length: {}", addr, data.len());
framed.get_ref().send_to(&data, &addr); // Just echo back the data
Ok(())
});
I haven't checked this code, since (as Shepmaster rightly pointed out) your code snippet has other problems, but it should give you the idea anyway. I'll repeat my warning from earlier: if you do this in real code, it will break the network protocol you're using. get_ref's documentation puts it like this:
Note that care should be taken to not tamper with the underlying stream of data coming in as it may corrupt the stream of frames otherwise being worked with.
To answer the new part of your question: yes, you need to handle reassembly yourself, which means your codec does actually need to do some framing on the bytes you're sending. Typically this might involve a start sequence which cannot occur in the Vec<u8>. The start sequence lets you recognise the start of the next message after a packet was lost (which happens a lot with UDP). If there's no byte sequence that can't occur in the Vec<u8>, you need to escape it when it does occur. You might then send the length of the message, followed by the data itself; or just the data, followed by an end sequence and a checksum so you know none was lost. There are pros and cons to these designs, and it's a big topic in itself.
You also need your UdpCodec to contain data: a map from SocketAddr to the partially-reassembled message that's currently in progress. In decode, if you are given the start of a message, copy it into the map and return Ok. If you are given the middle of a message, and you already have the start of a message in the map (for that SocketAddr), append the buffer to the existing buffer and return Ok. When you get to the end of the message, return the whole thing and empty the buffer. The methods on UdpCodec take &mut self in order to enable this use case. (NB In theory, you should also deal with packets arriving out of order, but that's actually quite rare in the real world.)
encode is a lot simpler: you just need to add the same framing and copy the message into the buffer.
Let me reiterate here that you don't need to and shouldn't use the underlying socket after calling framed() on it. UdpFramed is both a source and a sink, so you use that one object to send the replies as well. You can even use split() to get separate Stream and Sink implementations out of it, if that makes the ownership easier in your application.
Overall, now I've seen how much of the problem you're struggling with, I'd recommend just using several TCP sockets instead of UDP. If you want a connection-oriented, reliable protocol, TCP already exists and does that for you. It's very easy to spend a lot of time making a "reliable" layer on top of UDP that is both slower and less reliable than TCP.

Resources