How to convert a Bytes Iterator into a Stream in Rust


I'm trying to build a feature which requires reading the contents of a file into a futures::stream::BoxStream, but I'm having a tough time figuring out what I need to do.
I have figured out how to read a file byte by byte via Bytes which implements an iterator.
use std::fs::File;
use std::io::prelude::*;
use std::io::{BufReader, Bytes};
// TODO: Convert this to a async Stream
fn async_read() -> Box<dyn Iterator<Item = Result<u8, std::io::Error>>> {
let f = File::open("/dev/random").expect("Could not open file");
let reader = BufReader::new(f);
let iter = reader.bytes().into_iter();
Box::new(iter)
}
fn main() {
ctrlc::set_handler(move || {
println!("received Ctrl+C!");
std::process::exit(0);
})
.expect("Error setting Ctrl-C handler");
for b in async_read().into_iter() {
println!("{:?}", b);
}
}
However, I've been struggling to figure out how I can turn this Box<dyn Iterator<Item = Result<u8, std::io::Error>>> into a Stream.
I would have thought something like this would work:
use futures::stream;
use std::fs::File;
use std::io::prelude::*;
use std::io::{BufReader, Bytes};
// TODO: Convert this to a async Stream
fn async_read() -> stream::BoxStream<'static, dyn Iterator<Item = Result<u8, std::io::Error>>> {
let f = File::open("/dev/random").expect("Could not open file");
let reader = BufReader::new(f);
let iter = reader.bytes().into_iter();
std::pin::Pin::new(Box::new(stream::iter(iter)))
}
fn main() {
ctrlc::set_handler(move || {
println!("received Ctrl+C!");
std::process::exit(0);
})
.expect("Error setting Ctrl-C handler");
while let Some(b) = async_read().poll() {
println!("{:?}", b);
}
}
But I keep getting a ton of compiler errors. I've tried other permutations, but I'm generally getting nowhere.
One of the compiler errors:
  --> src/main.rs:14:24
   |
14 |     std::pin::Pin::new(Box::new(stream::iter(iter)))
   |                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected trait object `dyn std::iter::Iterator`, found enum `std::result::Result`
Anyone have any advice?
I'm pretty new to Rust, and specifically Streams/lower level stuff so I apologize if I got anything wrong, feel free to correct me.
For some additional background, I'm trying to do this so you can CTRL-C out of a command in nushell

I think you are overcomplicating it a bit. You can just return impl Stream from async_read; there is no need to box or pin (the same goes for the original Iterator-based version). You then need an async runtime to poll the stream (this example uses the executor provided by futures::executor::block_on). With that in place, you can call futures::stream::StreamExt::next() on the stream to get a future representing the next item.
Here is one way to do this:
use futures::prelude::*;
use std::{
    fs::File,
    io::{prelude::*, BufReader},
};
fn async_read() -> impl Stream<Item = Result<u8, std::io::Error>> {
    let f = File::open("/dev/random").expect("Could not open file");
    let reader = BufReader::new(f);
    stream::iter(reader.bytes())
}
async fn async_main() {
    // Create the stream once; calling async_read() in the loop condition
    // would reopen the file on every iteration.
    let mut bytes = async_read();
    while let Some(b) = bytes.next().await {
        println!("{:?}", b);
    }
}
fn main() {
    ctrlc::set_handler(move || {
        println!("received Ctrl+C!");
        std::process::exit(0);
    })
    .expect("Error setting Ctrl-C handler");
    futures::executor::block_on(async_main());
}
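The remark that the Iterator-based version needs no boxing either can be sketched with the standard library alone. The helper name read_bytes and the in-memory Cursor standing in for the file are assumptions made for illustration, not part of the question's code:

```rust
use std::io::{self, BufReader, Cursor, Read};

// Hypothetical helper (not from the question): returns the byte iterator
// as an opaque `impl Iterator`, so no Box<dyn Iterator> is needed.
fn read_bytes<R: Read>(source: R) -> impl Iterator<Item = io::Result<u8>> {
    BufReader::new(source).bytes()
}

fn main() {
    // A Cursor stands in for the file so the sketch runs anywhere.
    let source = Cursor::new(vec![1u8, 2, 3]);
    for b in read_bytes(source) {
        println!("{:?}", b);
    }
}
```

Being generic over Read also means the same function would accept a File, which is what the question opens.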

Related

How can I put an async function into a Vec in Rust?

I need to put some futures in a Vec for later joining. However if I try to collect it using an iterator, the compiler doesn't seem to be able to determine the type for the vector.
I'm trying to create a command line utility that accepts an arbitrary number of IP addresses, communicates with those remotes and collects the results for printing. The communication function works well, I've cut down the program to show the failure I need to understand.
use futures::future::join_all;
use itertools::Itertools;
use std::net::SocketAddr;
use std::str::from_utf8;
use std::fmt;
#[tokio::main(flavor = "current_thread")]
pub async fn main() -> Result<(), Box<dyn std::error::Error>> {
let socket: Vec<SocketAddr> = vec![
"192.168.20.33:502".parse().unwrap(),
"192.168.20.34:502".parse().unwrap(),];
let async_vec = vec![
MyStruct::get(socket[0]),
MyStruct::get(socket[1]),];
// The above 3 lines happen to work to build a Vec because there are
// 2 sockets. But I need to build a Vec to join_all from an arbitrary
// number of addresses. Why doesn't the line below work instead?
//let async_vec = socket.iter().map(|x| MyStruct::get(*x)).collect();
let rt = join_all(async_vec).await;
let results = rt.iter().map(|x| x.as_ref().unwrap().to_string()).join("\n");
let mut rvec: Vec<String> = results.split("\n").map(|x| x.to_string()).collect();
rvec.sort_by(|a, b| a[15..20].cmp(&b[15..20]));
println!("{}", rvec.join("\n"));
Ok(())
}
struct MyStruct {
serial: [u8; 12],
placeholder: String,
}
impl fmt::Display for MyStruct {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
let serial = match from_utf8(&self.serial) {
Ok(v) => v,
Err(_) => "(invalid)",
};
let lines = (1..4).map(|x| format!("{}, line{}, {}", serial, x, self.placeholder)).join("\n");
write!(f, "{}", lines)
}
}
impl MyStruct {
pub async fn get(sockaddr: SocketAddr) -> Result<MyStruct, Box<dyn std::error::Error>> {
let char = sockaddr.ip().to_string().chars().last().unwrap();
let rv = MyStruct{serial: [char as u8;12], placeholder: sockaddr.to_string(), };
Ok(rv)
}
}

This line:
let async_vec = socket.iter().map(|x| MyStruct::get(*x)).collect();
doesn't work because the compiler can't know that you want to collect everything into a Vec. You might want to collect into some other container (e.g. a linked list or a set). Therefore you need to tell the compiler the kind of container you want with:
let async_vec = socket.iter().map(|x| MyStruct::get(*x)).collect::<Vec<_>>();
or:
let async_vec: Vec<_> = socket.iter().map(|x| MyStruct::get(*x)).collect();
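To illustrate why the container has to be named, here is a minimal sketch using plain integers instead of futures. The helper names ports_as_vec and ports_as_set are made up for this example; the point is only that the same iterator chain can feed different containers:

```rust
use std::collections::BTreeSet;

// Hypothetical helpers: the same iterator chain collects into different
// containers, so the target type has to be spelled out somewhere.
fn ports_as_vec(ports: &[u16]) -> Vec<u16> {
    // Here the return type tells `collect` what to build.
    ports.iter().copied().collect()
}

fn ports_as_set(ports: &[u16]) -> BTreeSet<u16> {
    // Here the turbofish does; `_` leaves the element type to inference.
    ports.iter().copied().collect::<BTreeSet<_>>()
}

fn main() {
    let ports = [502, 80, 502];
    println!("{:?}", ports_as_vec(&ports));
    println!("{:?}", ports_as_set(&ports));
}
```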

how to generalize from `File` to `Read`?

I have some working code that reads a file, but I need to generalize it to pull data from additional sources other than simple disk files.
Is Read the correct generalization I should work with in order to replace File?
If so, how can I fix example2 in the following sample code? As is, it fails with the compile error dyn async_std::io::Read cannot be unpinned at the commented line. If not, what type should I return instead from get_read and are there any corresponding changes required in example2?
//! [dependencies]
//! tokio = { version = "1.0.1", features = ["full"] }
//! async-std = "1.8.0"
//! anyhow = "1.0.32"
use async_std::io::prelude::*;
use async_std::fs::File;
use anyhow::Result;
#[tokio::main]
async fn main() -> Result<()> {
example1().await?;
example2().await?;
Ok(())
}
// Example of consuming `File` ... works great!
async fn example1() -> Result<()> {
let mut file = get_file().await?;
let mut contents = String::new();
let _ = file.read_to_string(&mut contents).await?;
println!("read {} characters", contents.len());
Ok(())
}
// Example of consuming `Read` ... does not compile?
async fn example2() -> Result<()> {
let mut read = get_read().await?;
let mut contents = String::new();
// ERROR: `dyn async_std::io::Read` cannot be unpinned
let _ = read.read_to_string(&mut contents).await?;
println!("read {} characters", contents.len());
Ok(())
}
async fn get_read() -> Result<Box<dyn Read>> {
let file = get_file().await?;
Ok(Box::new(file))
}
async fn get_file() -> Result<File> {
let file = File::open("/etc/hosts").await?;
Ok(file)
}

You need to pin:
use std::pin::Pin;
async fn get_read() -> Result<Pin<Box<dyn Read>>> {
    let file = get_file().await?;
    Ok(Box::pin(file))
}
Box<File> (without Pin) works because File implements Unpin. Box<dyn Read + Unpin> would work too.
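The same distinction can be seen with std::future::Future, which follows the same pattern in the standard library: Box<F> is a Future only when F is Unpin, while Pin<Box<F>> is a Future for any F. The sketch below is an analogy under that assumption, not async_std code; poll_once, boxed_answer and NoopWaker are hypothetical names:

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Waker that does nothing; enough for futures that finish on the first poll.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Hypothetical helper: polls `fut` once. The `Future + Unpin` bound plays the
// role of the bound that rejected `Box<dyn Read>` in the question.
fn poll_once<F: Future + Unpin>(mut fut: F) -> Option<F::Output> {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    match Pin::new(&mut fut).poll(&mut cx) {
        Poll::Ready(value) => Some(value),
        Poll::Pending => None,
    }
}

// Pinning the box makes the trait object usable without `dyn Future + Unpin`.
fn boxed_answer() -> Pin<Box<dyn Future<Output = u8>>> {
    Box::pin(async { 42u8 })
}

fn main() {
    println!("{:?}", poll_once(boxed_answer()));
}
```

Changing the return type of boxed_answer to Box<dyn Future<Output = u8>> (with Box::new) should make the poll_once call fail to compile, since the unpinned trait object does not satisfy the Future bound.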

Why are spawned futures not executed by tokio_core::reactor::Core?

extern crate tokio; // 0.1.8
use tokio::prelude::*;
fn create_a_future(x: u8) -> Box<Future<Item = (), Error = ()>> {
Box::new(futures::future::ok(2).and_then(|a| {
println!("{}", a);
Ok(())
}))
}
fn main() {
let mut eloop = tokio_core::reactor::Core::new().unwrap();
let handle = eloop.handle();
for x in 0..10 {
let f = create_a_future(x);
handle.spawn(f);
}
}
I expect this to print to stdout, but it didn't happen. Am I using spawn in the wrong way?

As already mentioned in the comments, you are setting up a bunch of computation but never running any of it. Like iterators, you can think of futures as lazy. The compiler normally tells you about this when you directly create a future but never use it. Here, you are spawning the futures, so you don't get that warning, but nothing ever drives the Tokio reactor.
In many cases, you have a specific future you want to run, and you'd drive the reactor until that completes. In other cases, you run the reactor "forever", endlessly handling new work.
In this case, you can use Core::turn:
fn main() {
    let mut eloop = tokio_core::reactor::Core::new().unwrap();
    let handle = eloop.handle();
    for x in 0..10 {
        let f = create_a_future(x);
        handle.spawn(f);
    }
    eloop.turn(None);
}
On the return type -> Box<Future<Item = (), Error = ()>>: you don't need to (and probably shouldn't) do this in modern Rust. It's preferred to return an anonymous type:
fn create_a_future() -> impl Future<Item = (), Error = ()> {
futures::future::ok(2).and_then(|a| {
println!("{}", a);
Ok(())
})
}
On tokio_core::reactor::Core: my understanding is that this level of Tokio is reserved for more complicated setups. Many people can just use tokio::run and tokio::spawn:
fn main() {
tokio::run(futures::lazy(|| {
for _ in 0..10 {
tokio::spawn(create_a_future());
}
Ok(())
}))
}
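The laziness described at the start of this answer can be observed without any runtime. The sketch below uses today's std::future::Future rather than the futures 0.1 API from the question, and the names NoopWaker and ran_before_and_after_poll are invented for the example:

```rust
use std::cell::Cell;
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Wake, Waker};

// Waker that does nothing; the future below completes on its first poll.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Hypothetical helper: reports whether the async block's body had run
// before and after the first poll.
fn ran_before_and_after_poll() -> (bool, bool) {
    let ran = Cell::new(false);
    let mut fut = Box::pin(async {
        ran.set(true);
    });
    // Creating the future has not executed its body.
    let before = ran.get();

    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    // Polling is what actually drives the future.
    let _ = fut.as_mut().poll(&mut cx);
    (before, ran.get())
}

fn main() {
    println!("{:?}", ran_before_and_after_poll());
}
```

Spawning onto a reactor that is never turned is the same situation as the first half of this function: the work is set up, but nothing polls it.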

How to add special NotReady logic to tokio-io?

I'm trying to make a Stream that would wait until a specific character is in the buffer. I know there's read_until() on BufRead, but I actually need a custom solution, as this is a stepping stone to waiting until a specific string is in the buffer (or, for example, until a regexp match happens).
In the project where I first encountered the problem, future processing just hung when I got a Ready(_) from the inner future and returned NotReady from my function. I discovered I shouldn't do that per the docs (last paragraph). However, what I didn't get is the actual alternative that is promised in that paragraph. I read all the published documentation on the Tokio site and it doesn't make sense to me at the moment.
So following is my current code. Unfortunately I couldn't make it simpler and smaller as it's already broken. Current result is this:
Err(Custom { kind: Other, error: Error(Shutdown) })
Err(Custom { kind: Other, error: Error(Shutdown) })
Err(Custom { kind: Other, error: Error(Shutdown) })
<ad infinitum>
Expected result is getting some Ok(Ready(_)) out of it, while printing W and W', and waiting for specific character in buffer.
extern crate futures;
extern crate tokio_core;
extern crate tokio_io;
extern crate tokio_io_timeout;
extern crate tokio_process;
use futures::stream::poll_fn;
use futures::{Async, Poll, Stream};
use tokio_core::reactor::Core;
use tokio_io::AsyncRead;
use tokio_io_timeout::TimeoutReader;
use tokio_process::CommandExt;
use std::process::{Command, Stdio};
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;
struct Process {
child: tokio_process::Child,
stdout: Arc<Mutex<tokio_io_timeout::TimeoutReader<tokio_process::ChildStdout>>>,
}
impl Process {
fn new(
command: &str,
reader_timeout: Option<Duration>,
core: &tokio_core::reactor::Core,
) -> Self {
let mut cmd = Command::new(command);
let cat = cmd.stdout(Stdio::piped());
let mut child = cat.spawn_async(&core.handle()).unwrap();
let stdout = child.stdout().take().unwrap();
let mut timeout_reader = TimeoutReader::new(stdout);
timeout_reader.set_timeout(reader_timeout);
let timeout_reader = Arc::new(Mutex::new(timeout_reader));
Self {
child,
stdout: timeout_reader,
}
}
}
fn work() -> Result<(), ()> {
let window = Arc::new(Mutex::new(Vec::new()));
let mut core = Core::new().unwrap();
let process = Process::new("cat", Some(Duration::from_secs(20)), &core);
let mark = Arc::new(Mutex::new(b'c'));
let read_until_stream = poll_fn({
let window = window.clone();
let timeout_reader = process.stdout.clone();
move || -> Poll<Option<u8>, std::io::Error> {
let mut buf = [0; 8];
let poll;
{
let mut timeout_reader = timeout_reader.lock().unwrap();
poll = timeout_reader.poll_read(&mut buf);
}
match poll {
Ok(Async::Ready(0)) => Ok(Async::Ready(None)),
Ok(Async::Ready(x)) => {
{
let mut window = window.lock().unwrap();
println!("W: {:?}", *window);
println!("buf: {:?}", &buf[0..x]);
window.extend(buf[0..x].into_iter().map(|x| *x));
println!("W': {:?}", *window);
if let Some(_) = window.iter().find(|c| **c == *mark.lock().unwrap()) {
Ok(Async::Ready(Some(1)))
} else {
Ok(Async::NotReady)
}
}
}
Ok(Async::NotReady) => Ok(Async::NotReady),
Err(e) => Err(e),
}
}
});
let _stream_thread = thread::spawn(move || {
for o in read_until_stream.wait() {
println!("{:?}", o);
}
});
match core.run(process.child) {
Ok(_) => {}
Err(e) => {
println!("Child error: {:?}", e);
}
}
Ok(())
}
fn main() {
work().unwrap();
}
This is a complete example project.

If you need more data you need to call poll_read again until you either find what you were looking for or poll_read returns NotReady.
You might want to avoid looping in one task for too long, so you can build yourself a yield_task function to call instead if poll_read didn't return NotReady; it makes sure your task gets called again ASAP after other pending tasks were run.
To use it, just write return yield_task();
fn yield_inner() {
use futures::task;
task::current().notify();
}
#[inline(always)]
pub fn yield_task<T, E>() -> Poll<T, E> {
yield_inner();
Ok(Async::NotReady)
}
Also see futures-rs#354: Handle long-running, always-ready futures fairly.
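The first piece of advice in this answer (keep calling poll_read until the data is found or the reader itself reports NotReady) can be sketched as a loop. This version uses std::task::Poll instead of the futures 0.1 Async type, and takes the reader as a closure so it runs without Tokio; poll_until_mark and its parameters are hypothetical names:

```rust
use std::io;
use std::task::Poll;

// Hypothetical helper: keeps calling `poll_read` until `mark` appears in
// `window`, the source reports end of input, or the source is not ready.
fn poll_until_mark<F>(
    mut poll_read: F,
    mark: u8,
    window: &mut Vec<u8>,
) -> Poll<io::Result<Option<usize>>>
where
    F: FnMut(&mut [u8]) -> Poll<io::Result<usize>>,
{
    let mut buf = [0u8; 8];
    loop {
        match poll_read(&mut buf) {
            // The source itself is not ready, so it has arranged a wakeup;
            // only in this case is it safe to pass "not ready" upward.
            Poll::Pending => return Poll::Pending,
            Poll::Ready(Err(e)) => return Poll::Ready(Err(e)),
            // Zero bytes read means end of input.
            Poll::Ready(Ok(0)) => return Poll::Ready(Ok(None)),
            Poll::Ready(Ok(n)) => {
                window.extend_from_slice(&buf[..n]);
                if let Some(pos) = window.iter().position(|&b| b == mark) {
                    return Poll::Ready(Ok(Some(pos)));
                }
                // Mark not found yet: loop and poll again.
            }
        }
    }
}

fn main() {
    // Scripted source: two chunks, then "not ready".
    let mut chunks = vec![&b"ab"[..], &b"xc"[..]].into_iter();
    let mut window = Vec::new();
    let result = poll_until_mark(
        |buf| match chunks.next() {
            Some(chunk) => {
                buf[..chunk.len()].copy_from_slice(chunk);
                Poll::Ready(Ok(chunk.len()))
            }
            None => Poll::Pending,
        },
        b'c',
        &mut window,
    );
    println!("{:?} {:?}", result, window);
}
```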
With the new async/await API futures::task::current is gone; instead you'll need a std::task::Context reference, which is provided as parameter to the new std::future::Future::poll trait method.
If you're already manually implementing the std::future::Future trait you can simply insert:
context.waker().wake_by_ref();
return std::task::Poll::Pending;
Or build yourself a Future-implementing type that yields exactly once:
pub struct Yield {
ready: bool,
}
impl core::future::Future for Yield {
type Output = ();
fn poll(self: core::pin::Pin<&mut Self>, cx: &mut core::task::Context<'_>) -> core::task::Poll<Self::Output> {
let this = self.get_mut();
if this.ready {
core::task::Poll::Ready(())
} else {
cx.waker().wake_by_ref();
this.ready = true; // ready next round
core::task::Poll::Pending
}
}
}
pub fn yield_task() -> Yield {
Yield { ready: false }
}
And then use it in async code like this:
yield_task().await;
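To check that such a future really returns Pending exactly once and wakes itself, it can be driven by hand. The sketch below repeats the Yield type so it is self-contained; CountingWaker and drive are hypothetical helpers, and the busy-polling loop is only suitable for a demonstration, not as a real executor:

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Same shape as the Yield type above, repeated to keep the sketch standalone.
pub struct Yield {
    ready: bool,
}

impl Future for Yield {
    type Output = ();
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        let this = self.get_mut();
        if this.ready {
            Poll::Ready(())
        } else {
            cx.waker().wake_by_ref();
            this.ready = true; // ready next round
            Poll::Pending
        }
    }
}

pub fn yield_task() -> Yield {
    Yield { ready: false }
}

// Hypothetical waker that only counts how often it was woken.
struct CountingWaker(AtomicUsize);

impl Wake for CountingWaker {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

// Hypothetical busy-polling driver: returns (Pending results seen, wakeups).
fn drive<F: Future>(fut: F) -> (usize, usize) {
    let state = Arc::new(CountingWaker(AtomicUsize::new(0)));
    let waker = Waker::from(state.clone());
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    let mut pending = 0;
    while fut.as_mut().poll(&mut cx).is_pending() {
        pending += 1;
    }
    (pending, state.0.load(Ordering::SeqCst))
}

fn main() {
    println!("{:?}", drive(yield_task()));
}
```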

How can I pass a socket as an argument to a function being called within a thread?

I'm going to have multiple functions that all need access to one main socket.
Would it be better to:
1. Pass this socket to each function that needs access to it
2. Have a globally accessible socket
Can someone provide an example of the best way to do this?
I come from a Python/Nim background where things like this are easily done.
Edit:
How can I pass a socket as an arg to a function being called within a thread.
Ex.
fn main() {
let mut s = BufferedStream::new((TcpStream::connect(server).unwrap()));
let thread = Thread::spawn(move || {
func1(s, arg1, arg2);
});
while true {
func2(s, arg1);
}
}

Answer for updated question
We can use TcpStream::try_clone:
use std::io::Read;
use std::net::{Shutdown, TcpStream};
use std::thread;
use std::time::Duration;
fn main() {
    let mut stream = TcpStream::connect("127.0.0.1:34254").unwrap();
    let stream2 = stream.try_clone().unwrap();
    let _t = thread::spawn(move || {
        // close this stream after one second
        thread::sleep(Duration::from_secs(1));
        stream2.shutdown(Shutdown::Read).unwrap();
    });
    // wait for some data, will get canceled after one second
    let mut buf = [0];
    stream.read(&mut buf).unwrap();
}
Original answer
It's usually (let's say 99.9% of the time) a bad idea to have any global mutable state, if you can help it. Just do as you said: pass the socket to the functions that need it.
use std::io::{self, Write};
use std::net::TcpStream;
fn send_name(stream: &mut TcpStream) -> io::Result<()> {
    // write_all retries until every byte is sent; write may send only part
    stream.write_all(&[42])?;
    Ok(())
}
fn send_number(stream: &mut TcpStream) -> io::Result<()> {
    stream.write_all(&[1, 2, 3])?;
    Ok(())
}
fn main() {
    let mut stream = TcpStream::connect("127.0.0.1:31337").unwrap();
    let r = send_name(&mut stream).and_then(|_| send_number(&mut stream));
    match r {
        Ok(..) => println!("Yay, sent!"),
        Err(e) => println!("Boom! {}", e),
    }
}
You could also pass the TcpStream to a struct that manages it, and thus gives you a place to put similar methods.
use std::io::{self, Write};
use std::net::TcpStream;
struct GameService {
    stream: TcpStream,
}
impl GameService {
    fn send_name(&mut self) -> io::Result<()> {
        self.stream.write_all(&[42])?;
        Ok(())
    }
    fn send_number(&mut self) -> io::Result<()> {
        self.stream.write_all(&[1, 2, 3])?;
        Ok(())
    }
}
fn main() {
    let stream = TcpStream::connect("127.0.0.1:31337").unwrap();
    let mut service = GameService { stream };
    let r = service.send_name().and_then(|_| service.send_number());
    match r {
        Ok(..) => println!("Yay, sent!"),
        Err(e) => println!("Boom! {}", e),
    }
}
None of this is really Rust-specific, these are generally-applicable programming practices.
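One possible refinement of the "pass it as an argument" approach, offered as a sketch rather than as part of the answer above: making the functions generic over std::io::Write lets them accept a TcpStream or an in-memory buffer, which makes them easy to exercise without a server. The generic signatures are an assumption of this example:

```rust
use std::io::{self, Write};

// Hypothetical variants of send_name/send_number: generic over Write, so the
// same functions accept a TcpStream or an in-memory buffer.
fn send_name<W: Write>(out: &mut W) -> io::Result<()> {
    out.write_all(&[42])
}

fn send_number<W: Write>(out: &mut W) -> io::Result<()> {
    out.write_all(&[1, 2, 3])
}

fn main() {
    // A Vec<u8> stands in for the socket here.
    let mut sink = Vec::new();
    send_name(&mut sink).expect("write to Vec failed");
    send_number(&mut sink).expect("write to Vec failed");
    println!("{:?}", sink);
}
```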
