Rust multithreaded HTTP requests, get all data from response? - multithreading

So I have the following code that sends multithreaded requests to a list of domains. So far I am able to grab individual data from the response, such as the bytes, or the url, or the status code, etc. but not all of them together. I would like to store all of these values into vectors so it can be written to a file. I know this is probably a super dumb question, but i've been working on it for days and can't figure it out. Any help is appreciated!
#[tokio::main]
async fn get_request(urls: Vec<&str>, paths: Vec<&str>) {
let client = Client::new();
for path in paths {
let urls = urls.clone();
let bodies = stream::iter(urls)
.map(|url| {
let client = &client;
async move {
let mut full_url = String::new();
full_url.push_str(url);
full_url.push_str(path);
let resp = client
.get(&full_url)
.timeout(Duration::from_secs(3))
.send()
.await;
resp
}
})
.buffer_unordered(PARALLEL_REQUESTS);
bodies
.for_each(|b| async {
match b {
Ok(b) => {
let mut body_test = &b.bytes().await;
match body_test {
Ok(body_test) => {
let mut stringed_bytes = str::from_utf8(&body_test);
match stringed_bytes {
Ok(stringed_bytes) => {
println!("stringed bytes: {}", stringed_bytes);
}
Err(e) => println!("Stringified Error: {}", e),
}
}
Err(e) => println!("body Error: {}", e),
}
}
Err(e) => println!("Got an error: {}", e),
}
})
.await;
}
}

I would try creating a struct that would store your desired outputs that you would like to acquire from the body or the hyper::Response. For example:
struct DesiredContent {
code: Option<u16>,
body: Json
...
}
And then implementing impl TryFrom<hyper::Response> for DesiredContent
(This assumes you might have cases that should fail like - 404 errors)

Related

What lifetimes and bounds are needed to generalize this async code? [duplicate]

This question already has answers here:
How to fix lifetime error when function returns a serde Deserialize type?
(2 answers)
Why do Rust lifetimes matter when I move values into a spawned Tokio task?
(1 answer)
Closed 1 year ago.
I have this websocket code that uses tokio and serde here:
use async_once::AsyncOnce;
use common_wasm::models::status::{CommandMessage, StatusMessage};
use futures_util::{SinkExt, StreamExt};
use lazy_static::lazy_static;
use std::{collections::VecDeque, net::SocketAddr};
use tokio::{
net::{TcpListener, TcpStream}, sync::{broadcast, mpsc}
};
use tokio_tungstenite::{
accept_async, tungstenite::{Error, Message, Result}
};
use tracing::*;
// https://stackoverflow.com/questions/67650879/rust-lazy-static-with-async-await
lazy_static! {
pub static ref STATUS_REPORTER: AsyncOnce<StatusWs> = AsyncOnce::new(async {
info!("Init lazy static WS");
let server = StatusWs::init("ws://localhost:44444").await;
server
});
}
use StatusMessage as SenderType;
use CommandMessage as ReceiveType;
pub struct StatusWs {
buf: VecDeque<ReceiveType>,
rx_client_msg: mpsc::Receiver<ReceiveType>,
tx_server_msg: broadcast::Sender<SenderType>,
}
impl StatusWs {
pub async fn init(addr: &str) -> StatusWs {
info!("Init Status WS on {}", addr);
let listener = TcpListener::bind(&addr).await.expect("Can't listen");
// Clients producting to server, they use the tx to send and server uses the rx to read
let (tx_client_msg, rx_client_msg) = mpsc::channel::<ReceiveType>(32);
// spmc for server to broadcast status to listeners. Server uses tx to send and client uses rx to read
let (tx_server_msg, _rx_server_msg) = broadcast::channel::<SenderType>(10);
let tx_server_2 = tx_server_msg.clone();
tokio::spawn(async move {
while let Ok((stream, peer)) = listener.accept().await {
info!("Peer address connected: {}", peer);
let tx_client = tx_client_msg.clone();
let rx_server = tx_server_msg.subscribe();
tokio::spawn(async move {
accept_connection(peer, stream, tx_client, rx_server).await;
});
}
});
StatusWs { buf: VecDeque::new(), rx_client_msg, tx_server_msg: tx_server_2 }
}
pub async fn reportinfo(&self, msg: &SenderType) {
let my_msg = msg.clone();
match &self.tx_server_msg.send(my_msg) {
Ok(_size) => {
//trace!("Server Sending OK {}", size)
},
Err(_err) => {
//trace!("Server Sending ERR {:?}", err)
},
}
}
pub async fn next(&mut self) -> Result<Option<ReceiveType>> {
loop {
// If buffer contains data, we can directly return it.
if let Some(data) = self.buf.pop_front() {
return Ok(Some(data));
}
// Fetch new response if buffer is empty.
let response = self.next_response().await?;
// Handle the response, possibly adding to the buffer
self.handle_response(response)?;
}
}
async fn next_response(&mut self) -> Result<ReceiveType> {
loop {
tokio::select! { // TODO don't need select if there's only one thing?
Some(msg) = self.rx_client_msg.recv() => {
return Ok(msg)
},
}
}
}
fn handle_response(&mut self, response: ReceiveType) -> Result<()> {
self.buf.push_back(response);
Ok(())
}
}
async fn accept_connection(peer: SocketAddr, stream: TcpStream, tx_client: mpsc::Sender<ReceiveType>, rx_server: broadcast::Receiver<SenderType>) {
info!("Accepting connection from {}", peer);
if let Err(e) = handle_connection(peer, stream, tx_client, rx_server).await {
match e {
Error::ConnectionClosed | Error::Protocol(_) | Error::Utf8 => error!("Connection closed"),
err => error!("Error processing connection: {}", err),
}
}
}
async fn handle_connection(
_peer: SocketAddr, stream: TcpStream, tx_client: mpsc::Sender<ReceiveType>, mut rx_server: broadcast::Receiver<SenderType>,
) -> Result<()> {
let ws_stream = accept_async(stream).await.expect("Failed to accept");
let (mut ws_sender, mut ws_receiver) = ws_stream.split();
loop {
tokio::select! {
remote_msg = ws_receiver.next() => {
match remote_msg {
Some(msg) => {
let msg = msg?;
match msg {
Message::Text(resptxt) => {
match serde_json::from_str::<ReceiveType>(&resptxt) {
Ok(cmd) => { let _ = tx_client.send(cmd).await; },
Err(err) => error!("Error deserializing: {}", err),
}
},
Message::Close(_) => break,
_ => { },
}
}
None => break,
}
}
Ok(msg) = rx_server.recv() => {
match serde_json::to_string(&msg) {
Ok(txt) => ws_sender.send(Message::Text(txt)).await?,
Err(_) => todo!(),
}
}
}
}
Ok(())
}
The sender and receiver types are simple (simple types all the way down):
use std::{collections::BTreeMap, fmt::Debug};
use serde::{Deserialize, Serialize};
#[derive(Default, Clone, Debug, Serialize, Deserialize)]
pub struct StatusMessage {
pub name: String,
pub entries: BTreeMap<i32, GuiEntry>,
}
#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct CommandMessage {
pub sender: String,
pub entryid: i32,
pub command: GuiValue,
}
Now I want to generalize the code so that I can create a struct that takes some other kind of Sender and Receiver type. Yes, I could just change the aliases, but I want to be able to use the generic type arguments rather than duplicate the whole file. The problem is as I follow the suggestions from the compiler, I end up in a place where I don't know what to do next. It's telling me resptext does not live long enough:
`resptxt` does not live long enough
borrowed value does not live long enoughrust cE0597
status_ws.rs(133, 29): `resptxt` dropped here while still borrowed
status_ws.rs(115, 28): lifetime `'a` defined here
status_ws.rs(129, 39): argument requires that `resptxt` is borrowed for `'a`
Here's what I have thus far:
use async_once::AsyncOnce;
use common_wasm::models::status::{CommandMessage, StatusMessage};
use futures_util::{SinkExt, StreamExt};
use lazy_static::lazy_static;
use serde::{Serialize, Deserialize};
use std::{collections::VecDeque, net::SocketAddr};
use tokio::{
net::{TcpListener, TcpStream}, sync::{broadcast, mpsc}
};
use tokio_tungstenite::{
accept_async, tungstenite::{Error, Message, Result}
};
use tracing::*;
// https://stackoverflow.com/questions/67650879/rust-lazy-static-with-async-await
lazy_static! {
pub static ref STATUS_REPORTER: AsyncOnce<StatusWs<CommandMessage, StatusMessage>> = AsyncOnce::new(async {
info!("Init lazy static WS");
let server = StatusWs::init("ws://localhost:44444").await;
server
});
}
// use StatusMessage as SenderType;
// use CommandMessage as ReceiveType;
pub struct StatusWs<ReceiveType, SenderType> {
buf: VecDeque<ReceiveType>,
rx_client_msg: mpsc::Receiver<ReceiveType>,
tx_server_msg: broadcast::Sender<SenderType>,
}
impl <'a, ReceiveType: Deserialize<'a> + Send, SenderType: Serialize + Clone + Send + Sync> StatusWs <ReceiveType, SenderType> {
pub async fn init(addr: &str) -> StatusWs<ReceiveType, SenderType> {
info!("Init Status WS on {}", addr);
let listener = TcpListener::bind(&addr).await.expect("Can't listen");
// Clients producting to server, they use the tx to send and server uses the rx to read
let (tx_client_msg, rx_client_msg) = mpsc::channel::<ReceiveType>(32);
// spmc for server to broadcast status to listeners. Server uses tx to send and client uses rx to read
let (tx_server_msg, _rx_server_msg) = broadcast::channel::<SenderType>(10);
let tx_server_2 = tx_server_msg.clone();
tokio::spawn(async move {
while let Ok((stream, peer)) = listener.accept().await {
info!("Peer address connected: {}", peer);
let tx_client = tx_client_msg.clone();
let rx_server = tx_server_msg.subscribe();
tokio::spawn(async move {
accept_connection(peer, stream, tx_client, rx_server).await;
});
}
});
StatusWs { buf: VecDeque::new(), rx_client_msg, tx_server_msg: tx_server_2 }
}
pub async fn reportinfo(&self, msg: &SenderType) {
let my_msg = msg.clone();
match &self.tx_server_msg.send(my_msg) {
Ok(_size) => {
//trace!("Server Sending OK {}", size)
},
Err(_err) => {
//trace!("Server Sending ERR {:?}", err)
},
}
}
pub async fn next(&mut self) -> Result<Option<ReceiveType>> {
loop {
// If buffer contains data, we can directly return it.
if let Some(data) = self.buf.pop_front() {
return Ok(Some(data));
}
// Fetch new response if buffer is empty.
let response = self.next_response().await?;
// Handle the response, possibly adding to the buffer
self.handle_response(response)?;
}
}
async fn next_response(&mut self) -> Result<ReceiveType> {
loop {
tokio::select! { // TODO don't need select if there's only one thing?
Some(msg) = self.rx_client_msg.recv() => {
return Ok(msg)
},
}
}
}
fn handle_response(&mut self, response: ReceiveType) -> Result<()> {
self.buf.push_back(response);
Ok(())
}
}
async fn accept_connection<'a, ReceiveType: Deserialize<'a>, SenderType: Clone + Serialize>(peer: SocketAddr, stream: TcpStream, tx_client: mpsc::Sender<ReceiveType>, rx_server: broadcast::Receiver<SenderType>) {
info!("Accepting connection from {}", peer);
if let Err(e) = handle_connection(peer, stream, tx_client, rx_server).await {
match e {
Error::ConnectionClosed | Error::Protocol(_) | Error::Utf8 => error!("Connection closed"),
err => error!("Error processing connection: {}", err),
}
}
}
async fn handle_connection<'a, ReceiveType: Deserialize<'a>, SenderType: Clone + Serialize>(
_peer: SocketAddr, stream: TcpStream, tx_client: mpsc::Sender<ReceiveType>, mut rx_server: broadcast::Receiver<SenderType>,
) -> Result<()> {
let ws_stream = accept_async(stream).await.expect("Failed to accept");
let (mut ws_sender, mut ws_receiver) = ws_stream.split();
loop {
tokio::select! {
remote_msg = ws_receiver.next() => {
match remote_msg {
Some(msg) => {
let msg = msg?;
match msg {
Message::Text(resptxt) => {
match serde_json::from_str::<ReceiveType>(&resptxt) {
Ok(cmd) => { let _ = tx_client.send(cmd).await; },
Err(err) => error!("Error deserializing: {}", err),
}
},
Message::Close(_) => break,
_ => { },
}
}
None => break,
}
}
Ok(msg) = rx_server.recv() => {
match serde_json::to_string(&msg) {
Ok(txt) => ws_sender.send(Message::Text(txt)).await?,
Err(_) => todo!(),
}
}
}
}
Ok(())
}
I think there's some confusion about the necessary lifetimes and bounds, in particular the lifetime on the Deserializer from Serde and the Send/Sync auto trait markers on the message types.
In any case, it seems a bit brute force to just copy the whole original file and change out the aliases, which would definitely work, when it seems there's some sort of useful lesson here.
You should use serde::de::DeserializeOwned instead of Deserialize<'a>.
The Deserialize trait takes a lifetime parameter to support zero-cost deserialization, but you can't take advantage of that since the source, resptxt, is a transient value that isn't persisted anywhere. The DeserializeOwned trait can be used to constrain that the deserialized type does not keep references to the source and can therefore be used beyond it.
After fixing that, you'll get errors that ReceiveType and SenderType must be 'static to be used in a tokio::spawn'd task. Adding that constraint finally makes your code compile.
See the full compiling code on the playground for brevity.

Setting up a poll watcher using notify example

The notify documentation is great for raw and debounced events. There is mention of a poll watcher implementation but I cannot seem to set this up. Would anyone be able to point me or provide some sample code? Either for a raw or debounced method?
I have a mount drive I am trying to watch through Linux and the inotify instance is picking up no events so my only option is to try to poll.
Heres my current code for the debounced
extern crate notify;
extern crate weighted_rs;
use notify::{Watcher, RecursiveMode, watcher};
use std::sync::mpsc::channel;
use std::time::Duration;
use std::path::Path;
use std::fs;
use weighted_rs::{SmoothWeight, Weight};
fn main() {
// Create a channel to receive the events.
let (tx, rx) = channel();
// Create a watcher object, delivering debounced events.
// The notification back-end is selected based on the platform.
let mut watcher = watcher(tx, Duration::from_secs(5)).unwrap();
// Add a path to be watched. All files and directories at that path
and
// below will be monitored for changes.
//watcher.watch("/mnt/jobs_testwatch/",
RecursiveMode::NonRecursive).unwrap();
watcher.watch("/mnt/jobs/", RecursiveMode::NonRecursive).unwrap();
let mut sw: SmoothWeight<&str> = SmoothWeight::new();
sw.add("w1", 1);
sw.add("w2", 1);
sw.add("w3", 1);
loop {
match rx.recv() {
Err(e) => println!("watch error: {:?}", e),
Ok(event) => {
println!("{:?}", event);
match event{
notify::DebouncedEvent::NoticeWrite(_) => {}
notify::DebouncedEvent::NoticeRemove(_) => {}
notify::DebouncedEvent::Write(_) => {},
notify::DebouncedEvent::Chmod(_) => {},
notify::DebouncedEvent::Remove(_) => {}
notify::DebouncedEvent::Rename(_, _) => {}
notify::DebouncedEvent::Rescan => {}
notify::DebouncedEvent::Error(_, _) => {},
notify::DebouncedEvent::Create(x) => {
println!("{:?}", x);
let path_gen = Path::new(&x);
let file_stem = path_gen.file_stem().unwrap();
let extension = path_gen.extension();
//println!("filestem = {:?}, extension = {:?}",
file_stem, extension);
if extension != None {
//println!("event = {:?} -- Path = {:?}", op, path);
//println!("filename =>{:?}, extension =>{:?}",
file_stem, extension.unwrap());
//println!("");
//let xml_string = OsStr::new("xml");
match extension.unwrap().to_str(){
Some("xml")=>{
println!("found {:?}.", extension.unwrap());
let s = sw.next().unwrap();
println!("moving to /{} ...", s);
let movepath = format!
("/mnt/jobs/lbalancewatchers/{}/{}.xml",s,
file_stem.to_str().unwrap());
let foo = fs::read_to_string(&x);
match foo{
Ok(_) => {
let a = fs::rename(&x, &movepath);
if let Err(e) = a {
println!("error moving file -->
{}", e)
}
println!("moved {:?} to {:?}",
&file_stem, &movepath)},
//println!("{:#?}", foo.unwrap())},
Err(_) => println!("")
}
//println!("{:#?}", foo)
}
_ => {println!("{:?} is NOT an xml" ,
extension)}
}
};
},
}
}
}
}
}

How to print a response body in actix_web middleware?

I'd like to write a very simple middleware using actix_web framework but it's so far beating me on every front.
I have a skeleton like this:
let result = actix_web::HttpServer::new(move || {
actix_web::App::new()
.wrap_fn(move |req, srv| {
srv.call(req).map(move|res| {
println!("Got response");
// let s = res.unwrap().response().body();
// ???
res
})
})
})
.bind("0.0.0.0:8080")?
.run()
.await;
and I can access ResponseBody type via res.unwrap().response().body() but I don't know what can I do with this.
Any ideas?
This is an example of how I was able to accomplish this with 4.0.0-beta.14:
use std::cell::RefCell;
use std::pin::Pin;
use std::rc::Rc;
use std::collections::HashMap;
use std::str;
use erp_contrib::{actix_http, actix_web, futures, serde_json};
use actix_web::dev::{Service, ServiceRequest, ServiceResponse, Transform};
use actix_web::{HttpMessage, body, http::StatusCode, error::Error ,HttpResponseBuilder};
use actix_http::{h1::Payload, header};
use actix_web::web::{BytesMut};
use futures::future::{ok, Future, Ready};
use futures::task::{Context, Poll};
use futures::StreamExt;
use crate::response::ErrorResponse;
pub struct UnhandledErrorResponse;
impl<S: 'static> Transform<S, ServiceRequest> for UnhandledErrorResponse
where
S: Service<ServiceRequest, Response = ServiceResponse, Error = Error>,
S::Future: 'static,
{
type Response = ServiceResponse;
type Error = Error;
type Transform = UnhandledErrorResponseMiddleware<S>;
type InitError = ();
type Future = Ready<Result<Self::Transform, Self::InitError>>;
fn new_transform(&self, service: S) -> Self::Future {
ok(UnhandledErrorResponseMiddleware { service: Rc::new(RefCell::new(service)), })
}
}
pub struct UnhandledErrorResponseMiddleware<S> {
service: Rc<RefCell<S>>,
}
impl<S> Service<ServiceRequest> for UnhandledErrorResponseMiddleware<S>
where
S: Service<ServiceRequest, Response = ServiceResponse, Error = Error> + 'static,
S::Future: 'static,
{
type Response = ServiceResponse;
type Error = Error;
type Future = Pin<Box<dyn Future<Output = Result<Self::Response, Self::Error>>>>;
fn poll_ready(&self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
self.service.poll_ready(cx)
}
fn call(&self, mut req: ServiceRequest) -> Self::Future {
let svc = self.service.clone();
Box::pin(async move {
/* EXTRACT THE BODY OF REQUEST */
let mut request_body = BytesMut::new();
while let Some(chunk) = req.take_payload().next().await {
request_body.extend_from_slice(&chunk?);
}
let mut orig_payload = Payload::empty();
orig_payload.unread_data(request_body.freeze());
req.set_payload(actix_http::Payload::from(orig_payload));
/* now process the response */
let res: ServiceResponse = svc.call(req).await?;
let content_type = match res.headers().get("content-type") {
None => { "unknown"}
Some(header) => {
match header.to_str() {
Ok(value) => {value}
Err(_) => { "unknown"}
}
}
};
return match res.response().error() {
None => {
Ok(res)
}
Some(error) => {
if content_type.to_uppercase().contains("APPLICATION/JSON") {
Ok(res)
} else {
let error = error.to_string();
let new_request = res.request().clone();
/* EXTRACT THE BODY OF RESPONSE */
let _body_data =
match str::from_utf8(&body::to_bytes(res.into_body()).await?){
Ok(str) => {
str
}
Err(_) => {
"Unknown"
}
};
let mut errors = HashMap::new();
errors.insert("general".to_string(), vec![error]);
let new_response = match ErrorResponse::new(&false, errors) {
Ok(response) => {
HttpResponseBuilder::new(StatusCode::BAD_REQUEST)
.insert_header((header::CONTENT_TYPE, "application/json"))
.body(serde_json::to_string(&response).unwrap())
}
Err(_error) => {
HttpResponseBuilder::new(StatusCode::BAD_REQUEST)
.insert_header((header::CONTENT_TYPE, "application/json"))
.body("An unknown error occurred.")
}
};
Ok(ServiceResponse::new(
new_request,
new_response
))
}
}
}
})
}
}
The extraction of the Request Body is straightforward and similar to how Actix example's illustrate. However, with the update to version Beta 14, pulling the bytes directly from AnyBody has changed with the introduction of BoxedBody. Fortunately, I found a utility function body::to_bytes (use actix_web::body::to_bytes) which does a good job. It's current implementation looks like this:
pub async fn to_bytes<B: MessageBody>(body: B) -> Result<Bytes, B::Error> {
let cap = match body.size() {
BodySize::None | BodySize::Sized(0) => return Ok(Bytes::new()),
BodySize::Sized(size) => size as usize,
// good enough first guess for chunk size
BodySize::Stream => 32_768,
};
let mut buf = BytesMut::with_capacity(cap);
pin!(body);
poll_fn(|cx| loop {
let body = body.as_mut();
match ready!(body.poll_next(cx)) {
Some(Ok(bytes)) => buf.extend_from_slice(&*bytes),
None => return Poll::Ready(Ok(())),
Some(Err(err)) => return Poll::Ready(Err(err)),
}
})
.await?;
Ok(buf.freeze())
}
which I believe should be fine to extract the body in this way, as the body is extracted from the body stream by to_bytes().
If someone has a better way, let me know, but it was a little bit of pain, and I only had recently determined how to do it in Beta 13 when it switched to Beta 14.
This particular example intercepts errors and rewrites them to JSON format if they're not already json format. This would be the case, as an example, if an error occurs outside of a handler, such as parsing JSON in the handler function itself _data: web::Json<Request<'a, LoginRequest>> and not in the handler body. Extracting the Request Body and Response Body is not necessary to accomplish the goal, and is just here for illustration.

Why does reading from a Rusoto S3 stream inside an Actix Web handler cause a deadlock?

I'm writing an application using actix_web and rusoto_s3.
When I run a command outside of an actix request directly from main, it runs fine, and the get_object works as expected. When this is encapsulated inside an actix_web request, the stream is blocked forever.
I have a client that is shared for all requests which is encapsulated into an Arc (this happens in actix data internals).
Full code:
fn index(
_req: HttpRequest,
path: web::Path<String>,
s3: web::Data<S3Client>,
) -> impl Future<Item = HttpResponse, Error = actix_web::Error> {
s3.get_object(GetObjectRequest {
bucket: "my_bucket".to_owned(),
key: path.to_owned(),
..Default::default()
})
.and_then(move |res| {
info!("Response {:?}", res);
let mut stream = res.body.unwrap().into_blocking_read();
let mut body = Vec::new();
stream.read_to_end(&mut body).unwrap();
match process_file(body.as_slice()) {
Ok(result) => Ok(result),
Err(error) => Err(RusotoError::from(error)),
}
})
.map_err(|e| match e {
RusotoError::Service(GetObjectError::NoSuchKey(key)) => {
actix_web::error::ErrorNotFound(format!("{} not found", key))
}
error => {
error!("Error: {:?}", error);
actix_web::error::ErrorInternalServerError("error")
}
})
.from_err()
.and_then(move |img| HttpResponse::Ok().body(Body::from(img)))
}
fn health() -> HttpResponse {
HttpResponse::Ok().finish()
}
fn main() -> std::io::Result<()> {
let name = "rust_s3_test";
env::set_var("RUST_LOG", "debug");
pretty_env_logger::init();
let sys = actix_rt::System::builder().stop_on_panic(true).build();
let prometheus = PrometheusMetrics::new(name, "/metrics");
let s3 = S3Client::new(Region::Custom {
name: "eu-west-1".to_owned(),
endpoint: "http://localhost:9000".to_owned(),
});
let s3_client_data = web::Data::new(s3);
Server::build()
.bind(name, "0.0.0.0:8080", move || {
HttpService::build().keep_alive(KeepAlive::Os).h1(App::new()
.register_data(s3_client_data.clone())
.wrap(prometheus.clone())
.wrap(actix_web::middleware::Logger::default())
.service(web::resource("/health").route(web::get().to(health)))
.service(web::resource("/{file_name}").route(web::get().to_async(index))))
})?
.start();
sys.run()
}
In stream.read_to_end the thread is being blocked and never resolved.
I have tried cloning the client per request and also creating a new client per request, but I've got the same result in all scenarios.
Am I doing something wrong?
It works if I don't use it async...
s3.get_object(GetObjectRequest {
bucket: "my_bucket".to_owned(),
key: path.to_owned(),
..Default::default()
})
.sync()
.unwrap()
.body
.unwrap()
.into_blocking_read();
let mut body = Vec::new();
io::copy(&mut stream, &mut body);
Is this an issue with Tokio?
let mut stream = res.body.unwrap().into_blocking_read();
Check the implementation of into_blocking_read(): it calls .wait(). You shouldn't call blocking code inside a Future.
Since Rusoto's body is a Stream, there is a way to read it asynchronously:
.and_then(move |res| {
info!("Response {:?}", res);
let stream = res.body.unwrap();
stream.concat2().map(move |file| {
process_file(&file[..]).unwrap()
})
.map_err(|e| RusotoError::from(e)))
})
process_file should not block the enclosing Future. If it needs to block, you may consider running it on new thread or encapsulate with tokio_threadpool's blocking.
Note: You can use tokio_threadpool's blocking in your implementation, but I recommend you understand how it works first.
If you are not aiming to load the whole file into memory, you can use for_each:
stream.for_each(|part| {
//process each part in here
//Warning! Do not add blocking code here either.
})
See also:
What is the best approach to encapsulate blocking I/O in future-rs?
Why does Future::select choose the future with a longer sleep period first?

Websocket client message payload "does not live long enough"

I modified an existing websocket client to save the message payload from the websocket server:
fn main()
{
let mut globaltest = "";
// ...some othercode
let receive_loop = thread::spawn(move || {
// Receive loop
for message in receiver.incoming_messages() {
let message: Message = match message {
Ok(m) => m,
Err(e) => {
println!("Receive Loop: {:?}", e);
let _ = tx_1.send(Message::close());
return;
}
};
match message.opcode {
Type::Close => {
// Got a close message, so send a close message and return
let _ = tx_1.send(Message::close());
return;
}
Type::Ping => match tx_1.send(Message::pong(message.payload)) {
// Send a pong in response
Ok(()) => (),
Err(e) => {
println!("Receive Loop: {:?}", e);
return;
}
},
// Say what we received
_ => {
println!("Receive Loop: {:?}", message);
// recive from motion
let buf = &message.payload;
{
let s = match str::from_utf8(buf) {
Ok(v) => v,
Err(e) => panic!("Invalid UTF-8 sequence: {}", e),
};
println!("{}",s);
////>>> problem here <<<
globaltest = s;
}
},
}
}
});
// ...some othercode
}
When I build, I get an error message:
nathaniel#nathaniel-virtual-machine:~/rustcoderep/rsdummywsclient$ sudo cargo build
Compiling rsdummywsclient v0.1.0 (file:///home/nathaniel/rustcoderep/rsdummywsclient)
src/main.rs:94:36: 94:51 error: `message.payload` does not live long enough
src/main.rs:94 let buf = &message.payload;
^~~~~~~~~~~~~~~
note: reference must be valid for the static lifetime...
src/main.rs:75:6: 106:4 note: ...but borrowed value is only valid for the block suffix following statement 0 at 75:5
src/main.rs:75 };
src/main.rs:76 match message.opcode {
src/main.rs:77 Type::Close => {
src/main.rs:78 // Got a close message, so send a close message and return
src/main.rs:79 let _ = tx_1.send(Message::close());
src/main.rs:80 return;
...
error: aborting due to previous error
I have no idea why. I tried a lot of solutions like Arc and Mutex, but none of them work :(
When I remove globaltest = s, the code builds and runs without problems. So I tried to write a simpler example:
use std::str;
use std::thread;
fn main() {
let mut y = 2;
let receive_loop = thread::spawn(move || {
let x = 1;
y = x;
println!("tt{:?}",y);
});
let receive_loop2 = thread::spawn(move || {
println!("tt2{:?}",y);
});
println!("{:?}",y);
}
This works... with almost the same structure.
Here is the full code, only a little different from the rust-websocket client sample:
extern crate websocket;
fn main() {
use std::thread;
use std::sync::mpsc::channel;
use std::io::stdin;
use std::str;
use websocket::{Message, Sender, Receiver};
use websocket::message::Type;
use websocket::client::request::Url;
use websocket::Client;
let mut globaltest ="";
let url = Url::parse("ws://127.0.0.1:2794").unwrap();
println!("Connecting to {}", url);
let request = Client::connect(url).unwrap();
let response = request.send().unwrap(); // Send the request and retrieve a response
println!("Validating response...");
response.validate().unwrap(); // Validate the response
println!("Successfully connected");
let (mut sender, mut receiver) = response.begin().split();
let (tx, rx) = channel();
let tx_1 = tx.clone();
let send_loop = thread::spawn(move || {
loop {
// Send loop
let message: Message = match rx.recv() {
Ok(m) => m,
Err(e) => {
println!("Send Loop: {:?}", e);
return;
}
};
match message.opcode {
Type::Close => {
let _ = sender.send_message(&message);
// If it's a close message, just send it and then return.
return;
},
_ => (),
}
// Send the message
match sender.send_message(&message) {
Ok(()) => (),
Err(e) => {
println!("Send Loop: {:?}", e);
let _ = sender.send_message(&Message::close());
return;
}
}
}
});
let receive_loop = thread::spawn(move || {
// Receive loop
for message in receiver.incoming_messages() {
let message: Message = match message {
Ok(m) => m,
Err(e) => {
println!("Receive Loop: {:?}", e);
let _ = tx_1.send(Message::close());
return;
}
};
match message.opcode {
Type::Close => {
// Got a close message, so send a close message and return
let _ = tx_1.send(Message::close());
return;
}
Type::Ping => match tx_1.send(Message::pong(message.payload)) {
// Send a pong in response
Ok(()) => (),
Err(e) => {
println!("Receive Loop: {:?}", e);
return;
}
},
// Say what we received
_ => {
println!("Receive Loop: {:?}", message);
// recive from motion
let buf = &message.payload;
{
let s = match str::from_utf8(buf) {
Ok(v) => v,
Err(e) => panic!("Invalid UTF-8 sequence: {}", e),
};
println!("{}",s);
globaltest = s;
}
},
}
}
});
loop {
let mut input = String::new();
stdin().read_line(&mut input).unwrap();
let trimmed = input.trim();
let message = match trimmed {
"/close" => {
// Close the connection
let _ = tx.send(Message::close());
break;
}
// Send a ping
"/ping" => Message::ping(b"PING".to_vec()),
// Otherwise, just send text
_ => Message::text(trimmed.to_string()),
};
match tx.send(message) {
Ok(()) => (),
Err(e) => {
println!("Main Loop: {:?}", e);
break;
}
}
}
// We're exiting
println!("Waiting for child threads to exit");
let _ = send_loop.join();
let _ = receive_loop.join();
println!("Exited");
}
#fjh thanks your reply!
I modified my code into this, changing globaltest in receive_loop and accessing it from the main thread loop. I still get a confusing error, even after spending three hours I still cannot solve it :(
fn main() {
let mut globaltest:Arc<Mutex<String>> = Arc::new(Mutex::new(String::from("")));
//some other code...
let receive_loop = thread::spawn(move || {
// Receive loop
for message in receiver.incoming_messages() {
let message: Message = match message {
Ok(m) => m,
Err(e) => {
println!("Receive Loop: {:?}", e);
let _ = tx_1.send(Message::close());
return;
}
};
match message.opcode {
Type::Close => {
// Got a close message, so send a close message and return
let _ = tx_1.send(Message::close());
return;
}
Type::Ping => match tx_1.send(Message::pong(message.payload)) {
// Send a pong in response
Ok(()) => (),
Err(e) => {
println!("Receive Loop: {:?}", e);
return;
}
},
// Say what we received
_ => {
let mut globaltest_child = globaltest.lock().unwrap();
println!("Receive Loop: {:?}", message);
// recive from motion
let buf = &message.payload;
{
let s = match str::from_utf8(buf) {
Ok(v) => v,
Err(e) => panic!("Invalid UTF-8 sequence: {}", e),
};
{
//>>> if I do like this, globaltest value will same like globaltest_child??
*globaltest_child = String::from(s);
println!("{:?}",globaltest_child.clone());
}
}
},
}
}
});
loop {
let message = Message::text("mtconnect");
match tx.send(message) {
Ok(()) => (),
Err(e) => {
println!("Main Loop: {:?}", e);
break;
}
}
///>>> problem here////
println!("{:?}",globaltest.clone());
thread::sleep(time::Duration::from_millis(3000));
}
}
The compiler always tells me:
athaniel#nathaniel-virtual-machine:~/rustcoderep/rsadapter$ sudo cargo run
Compiling rsadapter v0.1.0 (file:///home/nathaniel/rustcoderep/rsadapter)
src/main.rs:166:25: 166:35 error: use of moved value: `globaltest` [E0382]
src/main.rs:166 println!("{:?}",globaltest.clone());
^~~~~~~~~~
<std macros>:2:25: 2:56 note: in this expansion of format_args!
<std macros>:3:1: 3:54 note: in this expansion of print! (defined in <std macros>)
src/main.rs:166:9: 166:45 note: in this expansion of println! (defined in <std macros>)
src/main.rs:166:25: 166:35 help: run `rustc --explain E0382` to see a detailed explanation
src/main.rs:102:35: 153:3 note: `globaltest` moved into closure environment here because it has type `alloc::arc::Arc<std::sync::mutex::Mutex<collections::string::String>>`, which is non-copyable
src/main.rs:102 let receive_loop = thread::spawn(move || {
src/main.rs:103
src/main.rs:104
src/main.rs:105 // Receive loop
src/main.rs:106 for message in receiver.incoming_messages() {
src/main.rs:107 let message: Message = match message {
...
src/main.rs:102:35: 153:3 help: perhaps you meant to use `clone()`?
error: aborting due to previous error
I still can't access globaltest in another thread.
When I remove globaltest = s, the code builds and runs without problems.
Yes, because that assignment is not safe to do. You're trying to modify a local variable declared in the main thread from within your other thread. That could lead to all sorts of problems, like data races, which is why the compiler won't let you do it.
It's difficult to say what the best way to fix this is without knowing more about what you want to do. That being said, you could probably fix this by making globaltest an Arc<Mutex<String>> instead of a &str, so you could safely access it from both threads.
Whenever you have a problem, you should spend time reducing the problem. This helps you understand where the problem is and is likely to remove extraneous details that may be confusing you.
In this case, you could start by removing all of the other match arms, replacing them with panic! calls. Then try replacing libraries with your own code, then eventually just standard library code. Eventually you will get to something much smaller that reproduces the problem.
This is called creating an MCVE, and is highly encouraged when you ask a question on Stack Overflow. However, it's 100% useful to yourself whenever you have a problem you don't yet understand. As a professional programmer, you are expected to do this legwork.
Here's one possible MCVE I was able to create:
use std::{str, thread};
fn main() {
let mut global_string = "one";
let child = thread::spawn(move || {
let payload = b"Some allocated raw data".to_vec();
let s = str::from_utf8(&payload).unwrap();
global_string = s;
});
println!("{}", global_string);
}
And it produces the same error ("reference must be valid for the static lifetime"). Specifically, the global_string variable is a &'static str, while s is a &str with a lifetime equivalent to the payload it is borrowed from. Simply put, the payload will be deallocated before the thread exits, which means that the string would point to invalid memory, which could cause a crash or security vulnerability. This is a class of errors that Rust prevents against.
This is what fjh is telling you.
Instead, you need to be able to ensure that the string will continue to live outside of the thread. The simplest way is to allocate memory that it will control. In Rust, this is a String:
use std::{str, thread};
fn main() {
let mut global_string = "one".to_string();
let child = thread::spawn(move || {
let payload = b"Some allocated raw data".to_vec();
let s = str::from_utf8(&payload).unwrap();
global_string = s.to_string();
});
println!("{}", global_string);
}
Now we've changed the error to "use of moved value: global_string", because we are transferring ownership of the String from main to the thread. We could try to fix that by cloning the string before we give it to the thread, but then we wouldn't be changing the outer one that we want.
Even if we could set the value in the outer thread, we'd get in trouble because we'd be creating a race condition where two threads are acting in parallel on one piece of data. You have no idea what state the variable is in when you try to access it. That's where a Mutex comes in. It makes it so that multiple threads can safely share access to one piece of data, one at a time.
However, you still have the problem that only one thread can own the Mutex at a time, but you need two threads to own it. That's where Arc comes in. An Arc can be cloned and the clone can be given to another thread. Both Arcs then point to the same value and ensure it is cleaned up when nothing is using it any more.
Note that we have to clone the Arc<Mutex<String>>> before we spawn the thread because we are transferring ownership of it into the thread:
use std::{str, thread};
use std::sync::{Arc, Mutex};
fn main() {
let global_string = Arc::new(Mutex::new("one".to_string()));
let inner = global_string.clone();
let child = thread::spawn(move || {
let payload = b"Some allocated raw data".to_vec();
let s = str::from_utf8(&payload).unwrap();
*inner.lock().unwrap() = s.to_string();
});
child.join().unwrap();
let s = global_string.lock().unwrap();
println!("{}", *s);
}

Resources