How do I start a web server in Rust with hyper?

I want to learn Rust by writing a reverse proxy with the hyper framework. My complete project is on GitHub. I'm stuck at starting a listener as explained in the documentation:
extern crate hyper;
use hyper::Client;
use hyper::server::{Server, Request, Response};
use std::io::Read;
fn pipe_through(req: Request, res: Response) {
let client = Client::new();
// Why does the response have to be mutable here? We never need to modify it, so we should be
// able to remove "mut"?
let mut response = client.get("http://drupal-8.localhost/").send().unwrap();
// Print out all the headers first.
for header in response.headers.iter() {
println!("{}", header);
}
// Now the body. This is ugly, why do I have to create an intermediary string variable? I want
// to push the response directly to stdout.
let mut body = String::new();
response.read_to_string(&mut body).unwrap();
print!("{}", body);
}
Server::http("127.0.0.1:9090").unwrap().handle(pipe_through).unwrap();
That does not work and fails with the following compile error:
error: expected one of `!` or `::`, found `(`
--> src/main.rs:23:13
|
23 | Server::http("127.0.0.1:9090").unwrap().handle(pipe_through).unwrap();
| ^
Why is my call to http() not correct? Shouldn't it create a new server as indicated in the documentation?

All expressions in Rust must be inside a function, so I need to start my server in fn main(). Then it works!
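For reference, a minimal sketch of the fix: the handler stays exactly as above, and the server start-up line simply moves into fn main():
fn main() {
    // Statements must live inside a function body; only item declarations
    // (fn, struct, use, ...) are allowed at the top level of a module.
    Server::http("127.0.0.1:9090").unwrap().handle(pipe_through).unwrap();
}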

Related

How to convert hyper's Body stream into a Result<Vec<String>>?

I'm updating code to the newest versions of hyper and futures, but everything I've tried is missing trait implementations in one way or another.
A non-working example playground for this ...
extern crate futures; // 0.3.5
extern crate hyper; // 0.13.6
use futures::{future, FutureExt, StreamExt, TryFutureExt, TryStreamExt};
use hyper::body;
fn get_body_as_vec<'a>(b: body::Body) -> future::BoxFuture<'a, Result<Vec<String>, hyper::Error>> {
let f = b.and_then(|bytes| {
let s = std::str::from_utf8(&bytes).expect("sends no utf-8");
let mut lines: Vec<String> = Vec::new();
for l in s.lines() {
lines.push(l.to_string());
}
future::ok(lines)
});
Box::pin(f)
}
This produces the error:
error[E0277]: the trait bound `futures::stream::AndThen<hyper::Body, futures::future::Ready<std::result::Result<std::vec::Vec<std::string::String>, hyper::Error>>, [closure@src/lib.rs:8:24: 15:6]>: futures::Future` is not satisfied
--> src/lib.rs:17:5
|
17 | Box::pin(f)
| ^^^^^^^^^^^ the trait `futures::Future` is not implemented for `futures::stream::AndThen<hyper::Body, futures::future::Ready<std::result::Result<std::vec::Vec<std::string::String>, hyper::Error>>, [closure@src/lib.rs:8:24: 15:6]>`
|
= note: required for the cast to the object type `dyn futures::Future<Output = std::result::Result<std::vec::Vec<std::string::String>, hyper::Error>> + std::marker::Send`
I'm unable to create a compatible future. Body is a stream and I can't find any "converter" function with the required traits implemented.
With hyper 0.12, I used concat2().
From the reference of and_then:
Note that this function consumes the receiving stream and returns a
wrapped version of it.
To process the entire stream and return a single future representing
success or error, use try_for_each instead.
Yes, your f is still a Stream. try_for_each will work as the reference suggests, but try_fold is a better choice for collecting the bytes into a vector of lines. However, as @Shepmaster points out in the comments, if we convert each chunk to UTF-8 directly, we can break multi-byte characters that span chunk boundaries in the response.
For the sake of data consistency, the easiest solution is to collect all the bytes before converting to UTF-8.
use futures::{future, FutureExt, TryStreamExt};
use hyper::body;
// This `Result` alias lets the error hold either a `hyper::Error` or a
// UTF-8 error (see the note on `std::error::Error` below).
type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;
fn get_body_as_vec<'a>(b: body::Body) -> future::BoxFuture<'a, Result<Vec<String>>> {
let f = b
.try_fold(vec![], |mut vec, bytes| {
vec.extend_from_slice(&bytes);
future::ok(vec)
})
.map(|x| {
Ok(std::str::from_utf8(&x?)?
.lines()
.map(ToString::to_string)
.collect())
});
Box::pin(f)
}
Playground
You can test the multiple-chunk behavior by using a channel from hyper's Body. Here I've created a scenario where a line is partitioned across chunks: this works fine with the code above, but if you process each chunk directly you lose consistency.
let (mut sender, body) = body::Body::channel();
tokio::spawn(async move {
sender
.send_data("Line1\nLine2\nLine3\nLine4\nLine5".into())
.await;
sender
.send_data("next bytes of Line5\nLine6\nLine7\nLine8\n----".into())
.await;
});
println!("{:?}", get_body_as_vec(body).await);
Playground (success scenario)
Playground (fail scenario: "next bytes of Line5" will be represented as a new line in the Vec)
Note: I've used std::error::Error as the return type since both hyper::Error and str::Utf8Error implement it; you may still use your expect strategy with hyper::Error.
I found two solutions, each of them pretty simple:
/*
WARNING for beginners!!! This use statement
is important so we can later use .data() method!!!
*/
use hyper::body::{to_bytes, HttpBody};
// Solution 1: takes only a single chunk of data!
let my_vector: Vec<u8> = request.into_body().data().await.unwrap().unwrap().to_vec();
let my_string = String::from_utf8(my_vector).unwrap();
// Solution 2: takes all data chunks, not just the first one:
let my_bytes = to_bytes(res.into_body()).await?;
This example doesn't handle errors properly; make sure your code does.
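For completeness, a hedged, self-contained sketch of the second approach (hyper 0.13/0.14 assumed; body_to_string is an illustrative name, not part of hyper):
use hyper::body::to_bytes;
use hyper::{Body, Request};

// Collect every chunk of the body into one `Bytes` buffer, then decode it
// as UTF-8. Both error types convert into `Box<dyn Error>` via `?`.
async fn body_to_string(req: Request<Body>) -> Result<String, Box<dyn std::error::Error>> {
    let bytes = to_bytes(req.into_body()).await?;
    Ok(String::from_utf8(bytes.to_vec())?)
}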

Where does a variable passed to Reqwest's Result::read_to_string get the data from?

I am learning Rust and have been playing around with this example to perform an HTTP GET request and then display the data:
extern crate reqwest;
use std::io::Read;
fn run() -> Result<(), Box<dyn std::error::Error>> {
let mut res = reqwest::get("http://httpbin.org/get")?;
let mut body = String::new();
res.read_to_string(&mut body)?;
println!("Status: {}", res.status());
println!("Headers:\n{:#?}", res.headers());
println!("Body:\n{}", body);
Ok(())
}
I cannot understand how the variable body actually ends up with the correct data. For headers and status I can see the associated functions, but for the body data it just calls read_to_string and somehow gets the whole payload?
The res object has a read_to_string() method which reads the response body into the String that you pass it:
res.read_to_string(&mut body);
Edit, imported from my comment: the reqwest::Response 0.6.2 documentation states, for the Read implementation on Response:
Read the body of the Response
which somehow seems to be missing from the documentation of the current version.
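In other words, read_to_string() is not defined on Response itself; it is a provided method of the std::io::Read trait, which reqwest's (blocking) Response implements. A minimal sketch of the same mechanism with any Read implementor (read_body is just an illustrative name):
use std::io::Read;

// Any type implementing std::io::Read gets read_to_string() for free;
// for reqwest's Response, the bytes are pulled from the network stream
// into the String you pass in.
fn read_body<R: Read>(mut src: R) -> std::io::Result<String> {
    let mut body = String::new();
    src.read_to_string(&mut body)?;
    Ok(body)
}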

How do I access HttpRequest data inside a Future in Actix-web?

I'd like to have an Actix Web handler which responds to a POST request by printing the POST body to the console and constructing an HTTP response that contains the current URL from the request object.
When reading the request's POST body, futures seem to need to be involved. The closest I've gotten so far is:
fn handler(req: HttpRequest) -> FutureResponse<HttpResponse> {
req.body()
.from_err()
.and_then(|bytes: Bytes| {
println!("Body: {:?}", bytes);
let url = format!("{scheme}://{host}",
scheme = req.connection_info().scheme(),
host = req.connection_info().host());
Ok(HttpResponse::Ok().body(url).into())
}).responder()
}
This won't compile because the future outlives the handler, so my attempts to read req.connection_info() are illegal. The compiler error suggests adding the move keyword to the closure definition, i.e. .and_then(move |bytes: Bytes| {. That also won't compile, because req is moved by the req.body() call and is then captured after the move by the references constructing url.
What is a reasonable way of constructing a scope in which I have access to data attached to the request object (e.g. the connection_info) at the same time as access to the POST body?
The simplest solution is to not access it inside the future at all:
extern crate actix_web; // 0.6.14
extern crate bytes; // 0.4.8
extern crate futures; // 0.1.21
use actix_web::{AsyncResponder, FutureResponse, HttpMessage, HttpRequest, HttpResponse};
use bytes::Bytes;
use futures::future::Future;
fn handler(req: HttpRequest) -> FutureResponse<HttpResponse> {
let url = format!(
"{scheme}://{host}",
scheme = req.connection_info().scheme(),
host = req.connection_info().host(),
);
req.body()
.from_err()
.and_then(move |bytes: Bytes| {
println!("Body: {:?}", bytes);
Ok(HttpResponse::Ok().body(url).into())
})
.responder()
}
In case this is more than a quick hack for demonstration purposes: constructing URLs by concatenating strings is a terrible idea, as it doesn't properly escape the values. You should use a type that does the escaping for you.
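For example, the url crate provides such a type. A minimal fragment-style sketch (the url dependency is an assumption, not part of the answer's code):
use url::Url;

// Parsing validates the URL, and the mutators percent-encode for you.
let mut url = Url::parse("https://example.com")?;
url.set_path("some path/with spaces");
url.query_pairs_mut().append_pair("q", "a&b=c");
// `url.as_str()` now contains properly escaped path and query components.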

How to download a large file with hyper and resume on error?

I want to download large files (500mb) with hyper, and be able to resume if the download fails.
Is there any way with hyper to run some function for each chunk of data received? The send() method returns a Result<Response>, but I can't find any methods on Response that return an iterator over chunks. Ideally I'd be able to do something like:
client.get(&url.to_string())
.send()
.map(|mut res| {
let mut chunk = String::new();
// write this chunk to disk
});
Is this possible, or will map only be called once hyper has downloaded the entire file?
Is there any way with hyper to run some function for each chunk of data received?
Hyper's Response implements Read. This means that the Response is a stream, and you can read arbitrary chunks of data from it as you usually would with a stream.
For what it's worth, here's a piece of code I use to download large files from ICECat. I'm using the Read interface in order to display the download progress in the terminal.
The variable response here is an instance of Hyper's Response.
{
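// `try_s!`, `status_line!`, `ERR!`, and `status_line_clear()` are helper
// macros/functions from the author's own codebase (this is an excerpt, not a
// self-contained example); `GzDecoder` is flate2's streaming gzip decoder.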
let mut file = try_s!(fs::File::create(&tmp_path));
let mut deflate = try_s!(GzDecoder::new(response));
let mut buf = [0; 128 * 1024];
let mut written = 0;
loop {
status_line! ("icecat_fetch] " (url) ": " (written / 1024 / 1024) " MiB.");
let len = match deflate.read(&mut buf) {
Ok(0) => break, // EOF.
Ok(len) => len,
Err(ref err) if err.kind() == io::ErrorKind::Interrupted => continue,
Err(err) => return ERR!("{}: Download failed: {}", url, err),
};
try_s!(file.write_all(&buf[..len]));
written += len;
}
}
try_s!(fs::rename(tmp_path, target_path));
status_line_clear();
I want to download large files (500mb) with hyper, and be able to resume if the download fails.
This is usually implemented with the HTTP "Range" header (cf. RFC 7233).
Not every server out there supports the "Range" header. I've seen a lot of servers with a custom HTTP stack that lack proper "Range" support, or that have the "Range" header disabled for some reason. So skipping over the chunks of Hyper's Response might be a necessary fallback.
But if you want to speed things up and save traffic then the primary means of resuming a stopped download should be by using the "Range" header.
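A hedged sketch of such a resume request (url and offset are assumed to be in scope; the builder API is the same as in the Hyper 0.12 example below):
// Ask the server for everything from byte `offset` onward (RFC 7233).
let req = Request::builder()
    .uri(url)
    .header("Range", format!("bytes={}-", offset))
    .body(Body::empty())
    .unwrap();
// A server honoring the header replies with `206 Partial Content`; a plain
// `200 OK` means it ignored the range and is resending the whole file.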
P.S. With Hyper 0.12, the response body returned by Hyper is a Stream, and to run some function for each chunk of data received we can use the for_each stream combinator:
extern crate futures;
extern crate futures_cpupool;
extern crate hyper; // 0.12
extern crate hyper_rustls;
use futures::Future;
use futures_cpupool::CpuPool;
use hyper::rt::Stream;
use hyper::{Body, Client, Request};
use hyper_rustls::HttpsConnector;
use std::thread;
use std::time::Duration;
fn main() {
let url = "https://steemitimages.com/DQmYWcEumaw1ajSge5PcGpgPpXydTkTcqe1daF4Ro3sRLDi/IMG_20130103_103123.jpg";
// In real life we'd want an asynchronous reactor, such as the tokio_core, but for a short example the `CpuPool` should do.
let pool = CpuPool::new(1);
let https = HttpsConnector::new(1);
let client = Client::builder().executor(pool.clone()).build(https);
// `unwrap` is used because there are different ways (and/or libraries) to handle the errors and you should pick one yourself.
// Also to keep this example simple.
let req = Request::builder().uri(url).body(Body::empty()).unwrap();
let fut = client.request(req);
// Rebinding (shadowing) the `fut` variable allows us (in smart IDEs) to more easily examine the gradual weaving of the types.
let fut = fut.then(move |res| {
let res = res.unwrap();
println!("Status: {:?}.", res.status());
let body = res.into_body();
// `for_each` returns a `Future` that we must embed into our chain of futures in order to execute it.
body.for_each(move |chunk| {println!("Got a chunk of {} bytes.", chunk.len()); Ok(())})
});
// Handle the errors: we need error-free futures for `spawn`.
let fut = fut.then(move |r| -> Result<(), ()> {r.unwrap(); Ok(())});
// Spawning the future onto a runtime starts executing it in background.
// If not spawned onto a runtime the future will be executed in `wait`.
//
// Note that we should keep the future around.
// To save resources most implementations would *cancel* the dropped futures.
let _fut = pool.spawn(fut);
thread::sleep (Duration::from_secs (1)); // or `_fut.wait()`.
}

tokio-curl: capture output into a local `Vec` - may outlive borrowed value

I do not know Rust well enough to understand lifetimes and closures yet...
Trying to collect the downloaded data into a vector using tokio-curl:
extern crate curl;
extern crate futures;
extern crate tokio_core;
extern crate tokio_curl;
use std::io::{self, Write};
use std::str;
use curl::easy::Easy;
use tokio_core::reactor::Core;
use tokio_curl::Session;
fn main() {
// Create an event loop that we'll run on, as well as an HTTP `Session`
// which we'll be routing all requests through.
let mut lp = Core::new().unwrap();
let mut out = Vec::new();
let session = Session::new(lp.handle());
// Prepare the HTTP request to be sent.
let mut req = Easy::new();
req.get(true).unwrap();
req.url("https://www.rust-lang.org").unwrap();
req.write_function(|data| {
out.extend_from_slice(data);
io::stdout().write_all(data).unwrap();
Ok(data.len())
})
.unwrap();
// Once we've got our session, issue an HTTP request to download the
// rust-lang home page
let request = session.perform(req);
// Execute the request, and print the response code as well as the error
// that happened (if any).
let mut req = lp.run(request).unwrap();
println!("{:?}", req.response_code());
println!("out: {}", str::from_utf8(&out).unwrap());
}
Produces an error:
error[E0373]: closure may outlive the current function, but it borrows `out`, which is owned by the current function
--> src/main.rs:25:24
|
25 | req.write_function(|data| {
| ^^^^^^ may outlive borrowed value `out`
26 | out.extend_from_slice(data);
| --- `out` is borrowed here
|
help: to force the closure to take ownership of `out` (and any other referenced variables), use the `move` keyword, as shown:
| req.write_function(move |data| {
Investigating further, I see that Easy::write_function requires the 'static lifetime, but the example of how to collect output from the curl-rust docs uses Transfer::write_function instead:
use curl::easy::Easy;
let mut data = Vec::new();
let mut handle = Easy::new();
handle.url("https://www.rust-lang.org/").unwrap();
{
let mut transfer = handle.transfer();
transfer.write_function(|new_data| {
data.extend_from_slice(new_data);
Ok(new_data.len())
}).unwrap();
transfer.perform().unwrap();
}
println!("{:?}", data);
The Transfer::write_function does not require the 'static lifetime:
impl<'easy, 'data> Transfer<'easy, 'data> {
/// Same as `Easy::write_function`, just takes a non `'static` lifetime
/// corresponding to the lifetime of this transfer.
pub fn write_function<F>(&mut self, f: F) -> Result<(), Error>
where F: FnMut(&[u8]) -> Result<usize, WriteError> + 'data
{
...
But I can't use a Transfer instance on tokio-curl's Session::perform because it requires the Easy type:
pub fn perform(&self, handle: Easy) -> Perform {
transfer.easy is a private field that is directly passed to session.perform.
Is this an issue with tokio-curl? Maybe it should mark the transfer.easy field as public, or implement a new function like perform_transfer? Is there another way to collect output using tokio-curl per transfer?
The first thing you have to understand when using the futures library is that you don't have any control over what thread the code is going to run on.
In addition, the documentation for curl's Easy::write_function says:
Note that the lifetime bound on this function is 'static, but that is often too restrictive. To use stack data consider calling the transfer method and then using write_function to configure a callback that can reference stack-local data.
The most straightforward solution is to use some type of locking primitive to ensure that only one thread at a time may access the vector. You also have to share ownership of the vector between the main thread and the closure:
use std::sync::{Arc, Mutex};
let out = Arc::new(Mutex::new(Vec::new()));
let out_closure = out.clone();
// ...
req.write_function(move |data| {
let mut out = out_closure.lock().expect("Unable to lock output");
// ...
}).expect("Cannot set writing function");
// ...
let out = out.lock().expect("Unable to lock output");
println!("out: {}", str::from_utf8(&out).expect("Data was not UTF-8"));
Unfortunately, the tokio-curl library does not currently support using the Transfer type that would allow for stack-based data.
