I am trying to use hyper to grab the content of an HTML page and would like to synchronously return the output of a future. I realized I could have picked a better example since synchronous HTTP requests already exist, but I am more interested in understanding whether we could return a value from an async calculation.
extern crate futures;
extern crate hyper;
extern crate hyper_tls;
extern crate tokio;
use futures::{future, Future, Stream};
use hyper::Client;
use hyper::Uri;
use hyper_tls::HttpsConnector;
use std::str;
fn scrap() -> Result<String, String> {
let scraped_content = future::lazy(|| {
let https = HttpsConnector::new(4).unwrap();
let client = Client::builder().build::<_, hyper::Body>(https);
client
.get("https://hyper.rs".parse::<Uri>().unwrap())
.and_then(|res| {
res.into_body().concat2().and_then(|body| {
let s_body: String = str::from_utf8(&body).unwrap().to_string();
futures::future::ok(s_body)
})
}).map_err(|err| format!("Error scraping web page: {:?}", &err))
});
scraped_content.wait()
}
fn read() {
let scraped_content = future::lazy(|| {
let https = HttpsConnector::new(4).unwrap();
let client = Client::builder().build::<_, hyper::Body>(https);
client
.get("https://hyper.rs".parse::<Uri>().unwrap())
.and_then(|res| {
res.into_body().concat2().and_then(|body| {
let s_body: String = str::from_utf8(&body).unwrap().to_string();
println!("Reading body: {}", s_body);
Ok(())
})
}).map_err(|err| {
println!("Error reading webpage: {:?}", &err);
})
});
tokio::run(scraped_content);
}
fn main() {
read();
let content = scrap();
println!("Content = {:?}", &content);
}
The example compiles and the call to read() succeeds, but the call to scrap() panics with the following error message:
Content = Err("Error scraping web page: Error { kind: Execute, cause: None }")
I understand that I failed to launch the task properly before calling .wait() on the future but I couldn't find how to properly do it, assuming it's even possible.
Standard library futures
Let's use this as our minimal, reproducible example:
async fn example() -> i32 {
42
}
Call executor::block_on:
use futures::executor; // 0.3.1
fn main() {
let v = executor::block_on(example());
println!("{}", v);
}
Tokio
Use the tokio::main attribute on any function (not just main!) to convert it from an asynchronous function to a synchronous one:
use tokio; // 0.3.5
#[tokio::main]
async fn main() {
let v = example().await;
println!("{}", v);
}
tokio::main is a macro that transforms this
#[tokio::main]
async fn main() {}
Into this:
fn main() {
tokio::runtime::Builder::new_multi_thread()
.enable_all()
.build()
.unwrap()
.block_on(async { {} })
}
This uses Runtime::block_on under the hood, so you can also write this as:
use tokio::runtime::Runtime; // 0.3.5
fn main() {
let v = Runtime::new().unwrap().block_on(example());
println!("{}", v);
}
For tests, you can use tokio::test.
async-std
Use the async_std::main attribute on the main function to convert it from an asynchronous function to a synchronous one:
use async_std; // 1.6.5, features = ["attributes"]
#[async_std::main]
async fn main() {
let v = example().await;
println!("{}", v);
}
For tests, you can use async_std::test.
Futures 0.1
Let's use this as our minimal, reproducible example:
use futures::{future, Future}; // 0.1.27
fn example() -> impl Future<Item = i32, Error = ()> {
future::ok(42)
}
For simple cases, you only need to call wait:
fn main() {
let s = example().wait();
println!("{:?}", s);
}
However, this comes with a pretty severe warning:
This method is not appropriate to call on event loops or similar I/O situations because it will prevent the event loop from making progress (this blocks the thread). This method should only be called when it's guaranteed that the blocking work associated with this future will be completed by another thread.
Tokio
If you are using Tokio 0.1, you should use Tokio's Runtime::block_on:
use tokio; // 0.1.21
fn main() {
let mut runtime = tokio::runtime::Runtime::new().expect("Unable to create a runtime");
let s = runtime.block_on(example());
println!("{:?}", s);
}
If you peek in the implementation of block_on, it actually sends the future's result down a channel and then calls wait on that channel! This is fine because Tokio guarantees to run the future to completion.
See also:
How can I efficiently extract the first element of a futures::Stream in a blocking manner?
As this is the top result that come up in search engines by the query "How to call async from sync in Rust", I decided to share my solution here. I think it might be useful.
As #Shepmaster mentioned, back in version 0.1 futures crate had beautiful method .wait() that could be used to call an async function from a sync one. This must-have method, however, was removed from later versions of the crate.
Luckily, it's not that hard to re-implement it:
trait Block {
fn wait(self) -> <Self as futures::Future>::Output
where Self: Sized, Self: futures::Future
{
futures::executor::block_on(self)
}
}
impl<F,T> Block for F
where F: futures::Future<Output = T>
{}
After that, you can just do following:
async fn example() -> i32 {
42
}
fn main() {
let s = example().wait();
println!("{:?}", s);
}
Beware that this comes with all the caveats of original .wait() explained in the #Shepmaster's answer.
This works for me using tokio:
tokio::runtime::Runtime::new()?.block_on(fooAsyncFunction())?;
Related
For a GUI tool I'm writing in Rust I'd like to kick off long-lived subprocesses and then repeatedly poll them for output. I don't want to block indefinitely waiting for output from any given one, so I'm using tokio and tokio-process to run the process and timeout the output reading. I need to store the subprocess in a struct as well.
I'm running into problems because it seems I need to consume the subprocess's output stream in order to read from it, so after a single read I'm unable to access it again.
I've included a simplified reproduction of my problem below. It prints actual output for the first invocation of print_output, but then prints No process_output_fut for the second invocation since I had to take the output future out of the Plugin struct and consume it in order to read from it.
Any suggestions for how to refactor this code to avoid consuming the process output future with each attempt to fetch the output?
extern crate tokio;
use std::process::{Command, Stdio};
use std::time::Duration;
use tokio::prelude::*;
use tokio::codec::{FramedRead, LinesCodec};
use tokio::prelude::stream::StreamFuture;
use tokio::timer::Timeout;
use tokio_process::{Child, ChildStdout, CommandExt};
pub struct Plugin {
process: Option<Child>,
process_output_fut: Option<Timeout<StreamFuture<FramedRead<ChildStdout, LinesCodec>>>>,
tokio_runtime: tokio::runtime::Runtime,
}
impl Plugin {
pub fn new() -> Self {
return Plugin {
process: None,
process_output_fut: None,
tokio_runtime: tokio::runtime::Runtime::new().expect(
"Could not create tokio runtime"
),
}
}
pub fn spawn(&mut self, arg: String) {
let mut process = Command::new("/bin/ls")
.arg(arg)
.stdout(Stdio::piped())
.spawn_async()
.expect("Failed to spawn command");
if let Some(plugin_output) = process.stdout().take() {
let lines_codec = LinesCodec::new();
self.process_output_fut = Some(FramedRead::new(plugin_output, lines_codec)
.into_future()
.timeout(Duration::from_millis(3000)));
}
self.process = Some(process);
}
pub fn print_output(&mut self) {
if let Some(process_output_fut) = self.process_output_fut.take() {
let result = self.tokio_runtime.block_on(process_output_fut).unwrap();
if let Some(output) = result.0 {
println!("{:?}", output);
};
} else {
println!("No process_output_fut");
}
}
}
fn main() {
let mut p = Plugin::new();
p.spawn("/");
p.print_output();
p.print_output();
}
This is what I have, but I want to avoid using unwrap on my reqwest values:
extern crate base64;
extern crate reqwest;
use serde_json;
use serde_json::json;
pub fn perform_get(id: String) -> serde_json::value::Value {
let client = reqwest::Client::builder().build().unwrap();
let url = String::from("SomeURL");
let res = client.get(&url).send().unwrap().text();
let mut v = json!(null);
match res {
Ok(n) => {
v = serde_json::from_str(&n).unwrap();
}
Err(r) => {
println!("Something wrong happened {:?}", r);
}
}
v
}
fn main() {
println!("Hi there! i want the function above to return a result instead of a Serde value so I can handle the error in main!");
}
Here is a link to a rust playground example
The official Rust book, The Rust Programming Language, is freely available online. It has an entire chapter on using Result, explaining introductory topics such as the Result enum and how to use it.
How to return a Result containing a serde_json::Value?
The same way you return a Result of any type; there's nothing special about Value:
use serde_json::json; // 1.0.38
pub fn ok_example() -> Result<serde_json::value::Value, i32> {
Ok(json! { "success" })
}
pub fn err_example() -> Result<serde_json::value::Value, i32> {
Err(42)
}
If you have a function that returns a Result, you can use the question mark operator (?) to exit early from a function on error, returning the error. This is a concise way to avoid unwrap or expect:
fn use_them() -> Result<(), i32> {
let ok = ok_example()?;
println!("{:?}", ok);
let err = err_example()?;
println!("{:?}", err); // Never executed, we always exit due to the `?`
Ok(()) // Never executed
}
This is just a basic example.
Applied to your MCVE, it would look something like:
use reqwest; // 0.9.10
use serde_json::Value; // 1.0.38
type Error = Box<dyn std::error::Error>;
pub fn perform_get(_id: String) -> Result<Value, Error> {
let client = reqwest::Client::builder().build()?;
let url = String::from("SomeURL");
let res = client.get(&url).send()?.text()?;
let v = serde_json::from_str(&res)?;
Ok(v)
}
Here, I'm using the trait object Box<dyn std::error::Error> to handle any kind of error (great for quick programs and examples). I then sprinkle ? on every method that could fail (i.e. returns a Result) and end the function with an explicit Ok for the final value.
Note that the panic and the never-used null value can be removed with this style.
See also:
What is this question mark operator about?
Rust proper error handling (auto convert from one error type to another with question mark)
Rust return result error from fn
Return value from match to Err(e)
What is the idiomatic way to handle/unwrap nested Result types?
better practice to return a Result
See also:
Should I avoid unwrap in production application?
If you are in the user side I would suggest to use Box<dyn std::error::Error>, this allow to return every type that implement Error, ? will convert the concrete error type to the dynamic boxed trait, this add a little overhead when there is an error but when error are not expected or really rare this is not a big deal.
use reqwest;
use serde_json::value::Value;
use std::error::Error;
fn perform_get(_id: String) -> Result<Value, Box<dyn Error>> {
let client = reqwest::Client::builder().build()?;
let url = String::from("SomeURL");
let res = client.get(&url).send()?.text()?;
let v = serde_json::from_str(&res)?;
Ok(v)
// last two line could be serde_json::from_str(&res).map_err(std::convert::Into::into)
}
fn main() {
println!("{:?}", perform_get("hello".to_string()));
}
This produce the following error:
Err(Error { kind: Url(RelativeUrlWithoutBase), url: None })
The kind smart folks over at Rust Discord helped me solve this one. (user noc)
extern crate base64;
extern crate reqwest;
pub fn get_jira_ticket() -> Result<serde_json::value::Value, reqwest::Error> {
let client = reqwest::Client::builder().build().unwrap();
let url = String::from("SomeURL");
let res = client.get(&url).send().and_then(|mut r| r.json());
res
}
fn main() {
println!("This works");
}
The key part was this in the header for the return
-> Result<serde_json::value::Value, reqwest::Error>
And this here to actually return the data.
client.get(&url).send().and_then(|mut r| r.json());
extern crate tokio; // 0.1.8
use tokio::prelude::*;
fn create_a_future(x: u8) -> Box<Future<Item = (), Error = ()>> {
Box::new(futures::future::ok(2).and_then(|a| {
println!("{}", a);
Ok(())
}))
}
fn main() {
let mut eloop = tokio_core::reactor::Core::new().unwrap();
let handle = eloop.handle();
for x in 0..10 {
let f = create_a_future(x);
handle.spawn(f);
}
}
I expect this to print to stdout, but it didn't happen. Am I using spawn in the wrong way?
As already mentioned in the comments, you are setting up a bunch of computation but never running any of it. Like iterators, you can think of futures as lazy. The compiler normally tells you about this when you directly create a future but never use it. Here, you are spawning the futures, so you don't get that warning, but nothing ever drives the Tokio reactor.
In many cases, you have a specific future you want to run, and you'd drive the reactor until that completes. In other cases, your run the reactor "forever", endlessly handling new work.
In this case, you can use Core::turn:
fn main() {
let mut eloop = tokio_core::reactor::Core::new().unwrap();
let handle = eloop.handle();
for x in 0..10 {
let f = create_a_future(x);
handle.spawn(f);
}
eloop.run(None);
}
eloop.turn(None);
-> Box<Future<Item = (), Error = ()>>
You don't need to (and probably shouldn't) do this in modern Rust. It's preferred to return an anonymous type:
fn create_a_future() -> impl Future<Item = (), Error = ()> {
futures::future::ok(2).and_then(|a| {
println!("{}", a);
Ok(())
})
}
tokio_core::reactor::Core
My understanding is that this level of Tokio is reserved for more complicated setups. Many people can just use tokio::run and tokio::spawn:
fn main() {
tokio::run(futures::lazy(|| {
for _ in 0..10 {
tokio::spawn(create_a_future());
}
Ok(())
}))
}
I'm trying to make a Stream that would wait until a specific character is in buffer. I know there's read_until() on BufRead but I actually need a custom solution, as this is a stepping stone to implement waiting until a specific string in in buffer (or, for example, a regexp match happens).
In my project where I first encountered the problem, problem was that future processing just hanged when I get a Ready(_) from inner future and return NotReady from my function. I discovered I shouldn't do that per docs (last paragraph). However, what I didn't get, is what's the actual alternative that is promised in that paragraph. I read all the published documentation on the Tokio site and it doesn't make sense for me at the moment.
So following is my current code. Unfortunately I couldn't make it simpler and smaller as it's already broken. Current result is this:
Err(Custom { kind: Other, error: Error(Shutdown) })
Err(Custom { kind: Other, error: Error(Shutdown) })
Err(Custom { kind: Other, error: Error(Shutdown) })
<ad infinum>
Expected result is getting some Ok(Ready(_)) out of it, while printing W and W', and waiting for specific character in buffer.
extern crate futures;
extern crate tokio_core;
extern crate tokio_io;
extern crate tokio_io_timeout;
extern crate tokio_process;
use futures::stream::poll_fn;
use futures::{Async, Poll, Stream};
use tokio_core::reactor::Core;
use tokio_io::AsyncRead;
use tokio_io_timeout::TimeoutReader;
use tokio_process::CommandExt;
use std::process::{Command, Stdio};
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;
struct Process {
child: tokio_process::Child,
stdout: Arc<Mutex<tokio_io_timeout::TimeoutReader<tokio_process::ChildStdout>>>,
}
impl Process {
fn new(
command: &str,
reader_timeout: Option<Duration>,
core: &tokio_core::reactor::Core,
) -> Self {
let mut cmd = Command::new(command);
let cat = cmd.stdout(Stdio::piped());
let mut child = cat.spawn_async(&core.handle()).unwrap();
let stdout = child.stdout().take().unwrap();
let mut timeout_reader = TimeoutReader::new(stdout);
timeout_reader.set_timeout(reader_timeout);
let timeout_reader = Arc::new(Mutex::new(timeout_reader));
Self {
child,
stdout: timeout_reader,
}
}
}
fn work() -> Result<(), ()> {
let window = Arc::new(Mutex::new(Vec::new()));
let mut core = Core::new().unwrap();
let process = Process::new("cat", Some(Duration::from_secs(20)), &core);
let mark = Arc::new(Mutex::new(b'c'));
let read_until_stream = poll_fn({
let window = window.clone();
let timeout_reader = process.stdout.clone();
move || -> Poll<Option<u8>, std::io::Error> {
let mut buf = [0; 8];
let poll;
{
let mut timeout_reader = timeout_reader.lock().unwrap();
poll = timeout_reader.poll_read(&mut buf);
}
match poll {
Ok(Async::Ready(0)) => Ok(Async::Ready(None)),
Ok(Async::Ready(x)) => {
{
let mut window = window.lock().unwrap();
println!("W: {:?}", *window);
println!("buf: {:?}", &buf[0..x]);
window.extend(buf[0..x].into_iter().map(|x| *x));
println!("W': {:?}", *window);
if let Some(_) = window.iter().find(|c| **c == *mark.lock().unwrap()) {
Ok(Async::Ready(Some(1)))
} else {
Ok(Async::NotReady)
}
}
}
Ok(Async::NotReady) => Ok(Async::NotReady),
Err(e) => Err(e),
}
}
});
let _stream_thread = thread::spawn(move || {
for o in read_until_stream.wait() {
println!("{:?}", o);
}
});
match core.run(process.child) {
Ok(_) => {}
Err(e) => {
println!("Child error: {:?}", e);
}
}
Ok(())
}
fn main() {
work().unwrap();
}
This is complete example project.
If you need more data you need to call poll_read again until you either find what you were looking for or poll_read returns NotReady.
You might want to avoid looping in one task for too long, so you can build yourself a yield_task function to call instead if poll_read didn't return NotReady; it makes sure your task gets called again ASAP after other pending tasks were run.
To use it just run return yield_task();.
fn yield_inner() {
use futures::task;
task::current().notify();
}
#[inline(always)]
pub fn yield_task<T, E>() -> Poll<T, E> {
yield_inner();
Ok(Async::NotReady)
}
Also see futures-rs#354: Handle long-running, always-ready futures fairly #354.
With the new async/await API futures::task::current is gone; instead you'll need a std::task::Context reference, which is provided as parameter to the new std::future::Future::poll trait method.
If you're already manually implementing the std::future::Future trait you can simply insert:
context.waker().wake_by_ref();
return std::task::Poll::Pending;
Or build yourself a Future-implementing type that yields exactly once:
pub struct Yield {
ready: bool,
}
impl core::future::Future for Yield {
type Output = ();
fn poll(self: core::pin::Pin<&mut Self>, cx: &mut core::task::Context<'_>) -> core::task::Poll<Self::Output> {
let this = self.get_mut();
if this.ready {
core::task::Poll::Ready(())
} else {
cx.waker().wake_by_ref();
this.ready = true; // ready next round
core::task::Poll::Pending
}
}
}
pub fn yield_task() -> Yield {
Yield { ready: false }
}
And then use it in async code like this:
yield_task().await;
I am a beginner in Rust.
I have a long running IO-bound process that I want to spawn and monitor via a REST API. I chose Iron for that, following this tutorial . Monitoring means getting its progress and its final result.
When I spawn it, I give it an id and map that id to a resource that I can GET to get the progress. I don't have to be exact with the progress; I can report the progress from 5 seconds ago.
My first attempt was to have a channel via which I send request for progress and receive the status. I got stuck where to store the receiver, as in my understanding it belongs to one thread only. I wanted to put it in the context of the request, but that won't work as there are different threads handling subsequent requests.
What would be the idiomatic way to do this in Rust?
I have a sample project.
Later edit:
Here is a self contained example which follows the sample principle as the answer, namely a map where each thread updates its progress:
extern crate iron;
extern crate router;
extern crate rustc_serialize;
use iron::prelude::*;
use iron::status;
use router::Router;
use rustc_serialize::json;
use std::io::Read;
use std::sync::{Mutex, Arc};
use std::thread;
use std::time::Duration;
use std::collections::HashMap;
#[derive(Debug, Clone, RustcEncodable, RustcDecodable)]
pub struct Status {
pub progress: u64,
pub context: String
}
#[derive(RustcEncodable, RustcDecodable)]
struct StartTask {
id: u64
}
fn start_process(status: Arc<Mutex<HashMap<u64, Status>>>, task_id: u64) {
let c = status.clone();
thread::spawn(move || {
for i in 1..100 {
{
let m = &mut c.lock().unwrap();
m.insert(task_id, Status{ progress: i, context: "in progress".to_string()});
}
thread::sleep(Duration::from_secs(1));
}
let m = &mut c.lock().unwrap();
m.insert(task_id, Status{ progress: 100, context: "done".to_string()});
});
}
fn main() {
let status: Arc<Mutex<HashMap<u64, Status>>> = Arc::new(Mutex::new(HashMap::new()));
let status_clone: Arc<Mutex<HashMap<u64, Status>>> = status.clone();
let mut router = Router::new();
router.get("/:taskId", move |r: &mut Request| task_status(r, &status.lock().unwrap()));
router.post("/start", move |r: &mut Request|
start_task(r, status_clone.clone()));
fn task_status(req: &mut Request, statuses: & HashMap<u64,Status>) -> IronResult<Response> {
let ref task_id = req.extensions.get::<Router>().unwrap().find("taskId").unwrap_or("/").parse::<u64>().unwrap();
let payload = json::encode(&statuses.get(&task_id)).unwrap();
Ok(Response::with((status::Ok, payload)))
}
// Receive a message by POST and play it back.
fn start_task(request: &mut Request, statuses: Arc<Mutex<HashMap<u64, Status>>>) -> IronResult<Response> {
let mut payload = String::new();
request.body.read_to_string(&mut payload).unwrap();
let task_start_request: StartTask = json::decode(&payload).unwrap();
start_process(statuses, task_start_request.id);
Ok(Response::with((status::Ok, json::encode(&task_start_request).unwrap())))
}
Iron::new(router).http("localhost:3000").unwrap();
}
One possibility is to use a global HashMap that associate each worker id with the progress (and result). Here is simple example (without the rest stuff)
#[macro_use]
extern crate lazy_static;
use std::sync::Mutex;
use std::collections::HashMap;
use std::thread;
use std::time::Duration;
lazy_static! {
static ref PROGRESS: Mutex<HashMap<usize, usize>> = Mutex::new(HashMap::new());
}
fn set_progress(id: usize, progress: usize) {
// insert replaces the old value if there was one.
PROGRESS.lock().unwrap().insert(id, progress);
}
fn get_progress(id: usize) -> Option<usize> {
PROGRESS.lock().unwrap().get(&id).cloned()
}
fn work(id: usize) {
println!("Creating {}", id);
set_progress(id, 0);
for i in 0..100 {
set_progress(id, i + 1);
// simulates work
thread::sleep(Duration::new(0, 50_000_000));
}
}
fn monitor(id: usize) {
loop {
if let Some(p) = get_progress(id) {
if p == 100 {
println!("Done {}", id);
// to avoid leaks, remove id from PROGRESS.
// maybe save that the task ends in a data base.
return
} else {
println!("Progress {}: {}", id, p);
}
}
thread::sleep(Duration::new(1, 0));
}
}
fn main() {
let w = thread::spawn(|| work(1));
let m = thread::spawn(|| monitor(1));
w.join().unwrap();
m.join().unwrap();
}
You need to register one channel per request thread, because if cloning Receivers were possible the responses might/will end up with the wrong thread if two request are running at the same time.
Instead of having your thread create a channel for answering requests, use a future. A future allows you to have a handle to an object, where the object doesn't exist yet. You can change the input channel to receive a Promise, which you then fulfill, no output channel necessary.