Perpetual tokio TCP stream (client) - rust

Foreword (you can skip to the next section)
So I decided to try Rust for my new, relatively small project, because I like that it produces a single executable which is easy to deploy on my ARM-based target with relatively few resources in terms of RAM and disk space. I have no previous experience with Rust, but a lot of experience with other languages, and so far I am getting somewhat disappointed. It seems that for many Rust libraries, and probably Rust itself, the APIs are changing so fast that 90% of the sample code found online will not compile with the latest versions of libraries like tokio, tokio-util, etc. Also, the documentation is often misleading. For example, if you Google for LinesCodec it shows up as tokio_io::codec::LinesCodec, tokio::codec::LinesCodec, tokio_codec::LinesCodec and tokio_util::codec::LinesCodec, the last of which seems to be the one to use as of today. The same confusion applies to other things like FramedRead, which had and_then and map member functions in some versions, but they don't seem to exist in the latest version. Lastly, the number of questions and answers related to Rust on SO is far smaller than for other languages I've used, which makes it harder to start using Rust. What I've been trying to do for the past two days is solved relatively easily in most programming languages, and I believe there must be an easy solution in Rust as well, but so far I've had no success.
Question itself
I need to connect a TCP client to a remote server and indefinitely read and process data line by line as it comes in. This needs to be done asynchronously because the same process also acts as an HTTP server, so I'm using tokio.
As far as I understand, the somewhat common way is to use TcpStream, split it into RX/TX parts, and then hook up a LinesCodec (with FramedRead), but I'm unable to hook all of these together without getting compilation errors.
[dependencies]
futures = "*"
hyper = "*"
tokio = { version = "*", features = ["full"] }
tokio-util = "0.2.0"
tokio-modbus = { version = "*", features = ["tcp", "server", "tcp-server-unstable"], git = "https://github.com/slowtec/tokio-modbus" }
let stream = TcpStream::connect("172.16.100.10:1001").await.unwrap();
let transport = FramedRead::new(stream, LinesCodec::new()); // need to split?
/* ... what to do next to process incoming data line-by-line ...? */
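A minimal sketch of what can come next (assuming a recent tokio / tokio-util with the codec feature enabled, plus the futures crate, and the placeholder address above): FramedRead is itself a Stream of decoded lines, so it can be polled in a loop with StreamExt::next, and no splitting is needed when only reading.
use futures::StreamExt; // FramedRead implements Stream; StreamExt provides .next()
use tokio::net::TcpStream;
use tokio_util::codec::{FramedRead, LinesCodec};

async fn read_lines_forever() -> Result<(), Box<dyn std::error::Error>> {
    let stream = TcpStream::connect("172.16.100.10:1001").await?;
    let mut lines = FramedRead::new(stream, LinesCodec::new());
    while let Some(line) = lines.next().await {
        // each item is a Result<String, LinesCodecError>
        let line = line?;
        println!("{}", line);
    }
    Ok(())
}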
So far I have come up with this solution; I'm not sure how good it is, though:
use tokio::io::{AsyncBufReadExt, BufReader}; // needed for read_line()
use tokio::net::TcpStream;

tokio::spawn(async {
    let connection = TcpStream::connect("172.16.100.10:1001").await.unwrap();
    let mut reader = BufReader::new(connection);
    loop {
        let mut line = String::new();
        reader.read_line(&mut line).await.unwrap();
        println!("{}", line);
    }
});

Simple app with a Cargo.toml like:
[dependencies]
tokio = { version = "0.3", features = ["full"] }
tokio-util = { version = "0.4", features = ["codec"] }
and a main.rs like:
use tokio::net::{TcpListener, TcpStream};
use tokio_util::codec::{Framed, LinesCodec};
use tokio::stream::StreamExt;
use std::error::Error;
use std::net::{IpAddr, Ipv4Addr, SocketAddr};

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    let args: Vec<String> = std::env::args().collect();
    if args[1] == "server" {
        let local_addr: String = format!("{}{}", ":::", args[2]); // app <server | client> <port>
        let listener = TcpListener::bind(&local_addr).await?;
        while let Ok((socket, peer)) = listener.accept().await {
            tokio::spawn(async move {
                println!("Client connected from: {}", peer);
                let mut client = Framed::new(socket, LinesCodec::new_with_max_length(1024));
                while let Some(Ok(line)) = client.next().await {
                    println!("{}", line);
                }
            });
        }
    } else if args[1] == "client" {
        let port = args[2].parse::<u16>().unwrap(); // app client <port>
        let saddr = SocketAddr::new(IpAddr::V4(Ipv4Addr::new(127, 0, 0, 1)), port);
        let conn = TcpStream::connect(saddr).await?;
        let mut server = Framed::new(conn, LinesCodec::new_with_max_length(1024));
        while let Some(Ok(line)) = server.next().await {
            println!("{}", line);
        }
    }
    Ok(())
}
To run as server:
cargo run server 8080
(in another shell) nc localhost 8080
To run as client:
(in another shell) nc -l -p 8080
cargo run client 8080

There's an example of how to use LinesCodec in the chat example program. The relevant section is the "process" function.
A smaller example (a server, but the principle is the same for a client) that reverses each line it receives and echoes it back, with a maximum buffer size of 5000:
use tokio::net::TcpListener;
use tokio::stream::StreamExt;
use tokio_util::codec::{Framed, LinesCodec};
use unicode_segmentation::UnicodeSegmentation;
use futures::SinkExt;

async fn talk(sock: tokio::net::TcpStream) {
    let mut lines = Framed::new(sock, LinesCodec::new_with_max_length(5000));
    while let Some(Ok(line)) = lines.next().await {
        let rev = line.graphemes(true).rev().collect::<String>();
        if let Err(_e) = lines.send(rev).await {
            break;
        }
    }
}
#[tokio::main]
async fn main() {
    let addr = "127.0.0.1:6200";
    let mut listener = TcpListener::bind(addr).await.unwrap();
    let mut incoming = listener.incoming();
    while let Some(conn) = incoming.next().await {
        match conn {
            Err(e) => eprintln!("accept failed = {:?}", e),
            Ok(sock) => {
                tokio::spawn(talk(sock));
            }
        }
    }
}
An issue with the read_line workaround is that read_line doesn't give you an option to limit the line length, so if you have untrusted input, it could cause your program to consume arbitrary amounts of memory. LinesCodec does give you an option to limit the line length.
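A related detail: a while let Some(Ok(line)) loop silently ends on the first decode error. A small sketch (assuming tokio-util's LinesCodec and the futures crate) of handling the oversized-line case explicitly instead:
use futures::StreamExt;
use tokio::net::TcpStream;
use tokio_util::codec::{FramedRead, LinesCodec, LinesCodecError};

async fn read_with_limit(stream: TcpStream) {
    let mut lines = FramedRead::new(stream, LinesCodec::new_with_max_length(1024));
    while let Some(item) = lines.next().await {
        match item {
            Ok(line) => println!("{}", line),
            Err(LinesCodecError::MaxLineLengthExceeded) => {
                // the peer sent a line longer than the configured maximum
                eprintln!("line too long, dropping connection");
                break;
            }
            Err(e) => {
                eprintln!("io error: {}", e);
                break;
            }
        }
    }
}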

Related

Can tokio::select allow defining an arbitrary number of branches?

I am writing an echo server that is able to listen on multiple ports. My working server below relies on select! to accept connections from two different listeners.
However, instead of defining the listeners as individual variables, is it possible to define the select branches based on a Vec<TcpListener>?
use tokio::{io, net, select, spawn};

#[tokio::main]
async fn main() {
    let listener1 = net::TcpListener::bind("127.0.0.1:8001").await.unwrap();
    let listener2 = net::TcpListener::bind("127.0.0.1:8002").await.unwrap();
    loop {
        let (conn, _) = select! {
            v = listener1.accept() => v.unwrap(),
            v = listener2.accept() => v.unwrap(),
        };
        spawn(handle(conn));
    }
}

async fn handle(mut conn: net::TcpStream) {
    let (mut read, mut write) = conn.split();
    io::copy(&mut read, &mut write).await.unwrap();
}
While futures::future::select_all() works, it is not very elegant (IMHO) and it creates an allocation for each round. A better solution is to use streams (note this also allocates on each round, but this allocates much less):
use tokio::{io, net, spawn};
use tokio_stream::wrappers::TcpListenerStream;
use futures::stream::{StreamExt, SelectAll};

#[tokio::main]
async fn main() {
    let mut listeners = SelectAll::new();
    listeners.push(TcpListenerStream::new(net::TcpListener::bind("127.0.0.1:8001").await.unwrap()));
    listeners.push(TcpListenerStream::new(net::TcpListener::bind("127.0.0.1:8002").await.unwrap()));
    while let Some(conn) = listeners.next().await {
        let conn = conn.unwrap();
        spawn(handle(conn));
    }
}
You can use the select_all function from the futures crate, which takes an iterator of futures and awaits any of them (instead of all of them, like join_all does):
use futures::{future::select_all, FutureExt};
use tokio::{io, net, select, spawn};

#[tokio::main]
async fn main() {
    let mut listeners = [
        net::TcpListener::bind("127.0.0.1:8001").await.unwrap(),
        net::TcpListener::bind("127.0.0.1:8002").await.unwrap(),
    ];
    loop {
        let (result, index, _) = select_all(
            listeners
                .iter_mut()
                // note: `FutureExt::boxed` is called here because `select_all`
                // requires the futures to be pinned
                .map(|listener| listener.accept().boxed()),
        )
        .await;
        let (conn, _) = result.unwrap();
        spawn(handle(conn));
    }
}

How to create threads in a for loop and get the return value from each?

I am writing a program that pings a set of targets 100 times and stores each RTT value returned from the ping into a vector, giving me a set of RTT values for each target. Say I have n targets; I would like all of the pinging to be done concurrently. The Rust code looks like this:
let mut sample_rtts_map = HashMap::new();
for addr in targets.to_vec() {
    let mut sampleRTTvalues: Vec<f32> = vec![];
    //sample_rtts_map.insert(addr, sampleRTTvalues);
    thread::spawn(move || {
        while sampleRTTvalues.len() < 100 {
            let sampleRTT = ping(addr);
            sampleRTTvalues.push(sampleRTT);
            // thread::sleep(Duration::from_millis(5000));
        }
    });
}
The hashmap is used to tell which vector of values belongs to which target. The problem is: how do I retrieve the updated sampleRTTvalues from each thread after the thread is done executing? I would like something like:
let (name, sampleRTTvalues) = thread::spawn(...)
The name being the name of the thread, and sampleRTTvalues being the vector. However, since I'm creating the threads in a for loop, each thread is instantiated the same way, so how do I differentiate them?
Is there some better way to do this? I've looked into schedulers, futures, etc., but it seems my case can be done with simple threads.
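For reference, the most direct pattern here is to keep the JoinHandle that thread::spawn returns and call join() on it, since join() hands back the closure's return value. A minimal sketch with a stubbed ping function and placeholder targets:
use std::collections::HashMap;
use std::thread;

// stub so the sketch is self-contained
fn ping(_addr: &str) -> f32 {
    0.0
}

fn main() {
    let targets = ["10.0.0.1", "10.0.0.2"]; // placeholder targets
    let handles: Vec<_> = targets
        .iter()
        .map(|&addr| {
            // each thread returns (target, vector of RTTs) as its value
            thread::spawn(move || {
                let samples: Vec<f32> = (0..100).map(|_| ping(addr)).collect();
                (addr, samples)
            })
        })
        .collect();
    let mut sample_rtts_map = HashMap::new();
    for handle in handles {
        let (addr, samples) = handle.join().expect("ping thread panicked");
        sample_rtts_map.insert(addr, samples);
    }
    println!("{:?}", sample_rtts_map.keys());
}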
I got the desired behavior with the following code:
use std::thread;
use std::sync::mpsc;
use std::collections::HashMap;
use rand::Rng;
use std::net::{Ipv4Addr, Ipv6Addr, IpAddr};

const RTT_ONE: IpAddr = IpAddr::V4(Ipv4Addr::new(127, 0, 0, 1));
const RTT_TWO: IpAddr = IpAddr::V6(Ipv6Addr::new(0, 0, 0, 0, 0, 0, 0, 1));
const RTT_THREE: IpAddr = IpAddr::V4(Ipv4Addr::new(127, 0, 1, 1)); // idk how IP addresses work, forgive me if this is invalid, but you get the idea

fn ping(_address: IpAddr) -> f32 {
    rand::thread_rng().gen_range(5.0..107.0)
}

fn main() {
    let targets = [RTT_ONE, RTT_TWO, RTT_THREE];
    let mut sample_rtts_map: HashMap<IpAddr, Vec<f32>> = HashMap::new();
    for addr in targets.into_iter() {
        let (sample_values, moved_values) = mpsc::channel();
        let mut sampleRTTvalues: Vec<f32> = vec![];
        thread::spawn(move || {
            while sampleRTTvalues.len() < 100 {
                let sampleRTT = ping(addr);
                sampleRTTvalues.push(sampleRTT);
                //thread::sleep(Duration::from_millis(5000));
            }
            // send the finished vector back to the main thread
            sample_values.send(sampleRTTvalues).unwrap();
        });
        sample_rtts_map.insert(addr, moved_values.recv().unwrap());
    }
}
Note that the use rand::Rng can be removed when implementing this for real; it is only there so the example works. What this does is pass data from the spawned thread to the main thread, and as written it waits until the data is ready before adding it to the hash map. If that is problematic (takes a long time, etc.), you can use try_recv instead of recv, which returns immediately with an error if the value is not ready yet, or with the value if it is.
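A small sketch of that non-blocking variant, using a stand-in worker thread in place of the ping loop:
use std::sync::mpsc::{self, TryRecvError};
use std::thread;
use std::time::Duration;

fn main() {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // stand-in for the 100-ping loop
        thread::sleep(Duration::from_millis(50));
        tx.send(vec![1.0_f32, 2.0, 3.0]).unwrap();
    });
    loop {
        match rx.try_recv() {
            Ok(values) => {
                println!("got {} samples", values.len());
                break;
            }
            Err(TryRecvError::Empty) => {
                // not ready yet; do other work and poll again later
                thread::sleep(Duration::from_millis(10));
            }
            Err(TryRecvError::Disconnected) => {
                eprintln!("worker exited without sending");
                break;
            }
        }
    }
}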
You can use a std::sync::mpsc channel to collect your data:
use std::collections::HashMap;
use std::sync::mpsc::channel;
use std::thread;

fn ping(_: &str) -> f32 { 0.0 }

fn main() {
    let targets = ["a", "b"]; // just for example
    let mut sample_rtts_map = HashMap::new();
    let (tx, rx) = channel();
    for addr in targets {
        let tx = tx.clone();
        thread::spawn(move || {
            for _ in 0..100 {
                let sampleRTT = ping(addr);
                tx.send((addr, sampleRTT));
            }
        });
    }
    drop(tx);
    // exit the loop when all threads' tx have been dropped
    while let Ok((addr, sampleRTT)) = rx.recv() {
        sample_rtts_map.entry(addr).or_insert(vec![]).push(sampleRTT);
    }
    println!("sample_rtts_map: {:?}", sample_rtts_map);
}
This runs all pinging threads simultaneously and collects the data in the main thread synchronously, so we can avoid using locks. Do not forget to drop the sender in the main thread after cloning it for all the pinging threads, or the main thread will hang forever.

How to add tracing to a Rust microservice?

I built a microservice in Rust. I receive messages, request a document based on the message, and call a REST API with the results. I built the REST API with warp and send out the result with reqwest. We use Jaeger for tracing and the "b3" format. I have no experience with tracing and am a Rust beginner.
Question: What do I need to add to the warp / reqwest source below to propagate the tracing information and add my own span?
My version endpoint (for simplicity) looks like:
pub async fn version() -> Result<impl warp::Reply, Infallible> {
    Ok(warp::reply::with_status(VERSION, http::StatusCode::OK))
}
I assume I have to extract e.g. the traceid / trace information here.
A reqwest call I do looks like this:
pub async fn get_document_content_as_text(
    account_id: &str,
    hash: &str,
) -> Result<String, Box<dyn std::error::Error>> {
    let client = reqwest::Client::builder().build()?;
    let res = client
        .get(url)
        .bearer_auth(TOKEN)
        .send()
        .await?;
    if res.status().is_success() {}
    let text = res.text().await?;
    Ok(text)
}
I assume I have to add the traceid / trace information here.
You need to add a tracing filter into your warp filter pipeline.
From the documentation example:
use warp::Filter;

let route = warp::any()
    .map(warp::reply)
    .with(warp::trace(|info| {
        // Create a span using tracing macros
        tracing::info_span!(
            "request",
            method = %info.method(),
            path = %info.path(),
        )
    }));
I'll assume that you're using tracing within your application and using opentelemetry and opentelemetry-jaeger to wire it up to an external service. The specific provider you're using doesn't matter. Here's a super simple setup to get that all working that I'll assume you're using on both applications:
# Cargo.toml
[dependencies]
opentelemetry = "0.17.0"
opentelemetry-jaeger = "0.16.0"
tracing = "0.1.33"
tracing-subscriber = { version = "0.3.11", features = ["env-filter"] }
tracing-opentelemetry = "0.17.2"
reqwest = "0.11.11"
tokio = { version = "1.21.1", features = ["macros", "rt", "rt-multi-thread"] }
warp = "0.3.2"
// note: the .with(...) / .init() calls need `use tracing_subscriber::prelude::*;` in scope
opentelemetry::global::set_text_map_propagator(opentelemetry_jaeger::Propagator::new());
tracing_subscriber::registry()
    .with(
        tracing_opentelemetry::layer().with_tracer(
            opentelemetry_jaeger::new_pipeline()
                .with_service_name("client") // or "server"
                .install_simple()
                .unwrap(),
        ),
    )
    .init();
Let's say the "client" application is set up like so:
#[tracing::instrument]
async fn call_hello() {
    let client = reqwest::Client::default();
    let _resp = client
        .get("http://127.0.0.1:3030/hello")
        .send()
        .await
        .unwrap()
        .text()
        .await
        .unwrap();
}

#[tokio::main]
async fn main() {
    // ... initialization above ...
    call_hello().await;
}
The traces produced by the client are a bit chatty because of other crates, but fairly simple, and do not include the server side.
Let's say the "server" application is set up like so:
#[tracing::instrument]
fn hello_handler() -> &'static str {
    tracing::info!("got hello message");
    "hello world"
}

#[tokio::main]
async fn main() {
    // ... initialization above ...
    let routes = warp::path("hello")
        .map(hello_handler);
    warp::serve(routes).run(([127, 0, 0, 1], 3030)).await;
}
Likewise, the traces produced by the server are pretty bare-bones.
The key part to marrying these two traces is to declare the client-side trace as the parent of the server-side trace. This can be done over HTTP requests with the traceparent and tracestate headers as designed by the W3C Trace Context standard. There is a TraceContextPropagator available from the opentelemetry crate that can be used to "extract" and "inject" these values (though, as you'll see, it's not very easy to work with since it only operates on HashMap<String, String>s).
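As a compact illustration of that inject/extract round trip (a sketch assuming opentelemetry 0.17, where TraceContextPropagator lives under opentelemetry::sdk::propagation):
use std::collections::HashMap;
use opentelemetry::Context;
use opentelemetry::propagation::TextMapPropagator;
use opentelemetry::sdk::propagation::TraceContextPropagator;

fn propagate_example() {
    let propagator = TraceContextPropagator::new();
    // "inject": serialize the current Context into string key/value pairs;
    // these become the traceparent / tracestate headers on the wire
    let mut fields: HashMap<String, String> = HashMap::new();
    propagator.inject_context(&Context::current(), &mut fields);
    // "extract": on the receiving side, rebuild a Context from those same
    // pairs and use it as the parent of the local span
    let _parent: Context = propagator.extract(&fields);
}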
For the "client" to send these headers, you'll need to:
get the current tracing Span
get the opentelemetry Context from the Span (if you're not using tracing at all, you can skip the first step and use Context::current() directly)
create the propagator and fields to propagate into, and "inject" them from the Context
use those fields as headers for reqwest
#[tracing::instrument]
async fn call_hello() {
    let span = tracing::Span::current();
    let context = span.context();
    let propagator = TraceContextPropagator::new();
    let mut fields = HashMap::new();
    propagator.inject_context(&context, &mut fields);
    let headers = fields
        .into_iter()
        .map(|(k, v)| {
            (
                HeaderName::try_from(k).unwrap(),
                HeaderValue::try_from(v).unwrap(),
            )
        })
        .collect();
    let client = reqwest::Client::default();
    let _resp = client
        .get("http://127.0.0.1:3030/hello")
        .headers(headers)
        .send()
        .await
        .unwrap()
        .text()
        .await
        .unwrap();
}
For the "server" to make use of those headers, you'll need to:
pull them out from the request and store them in a HashMap
use the propagator to "extract" the values into a Context
set that Context as the parent of the current tracing Span (if you didn't use tracing, you could .attach() it instead)
#[tracing::instrument]
fn hello_handler(traceparent: Option<String>, tracestate: Option<String>) -> &'static str {
    let fields: HashMap<_, _> = [
        dbg!(traceparent).map(|value| ("traceparent".to_owned(), value)),
        dbg!(tracestate).map(|value| ("tracestate".to_owned(), value)),
    ]
    .into_iter()
    .flatten()
    .collect();
    let propagator = TraceContextPropagator::new();
    let context = propagator.extract(&fields);
    let span = tracing::Span::current();
    span.set_parent(context);
    tracing::info!("got hello message");
    "hello world"
}

#[tokio::main]
async fn main() {
    // ... initialization above ...
    let routes = warp::path("hello")
        .and(warp::header::optional("traceparent"))
        .and(warp::header::optional("tracestate"))
        .map(hello_handler);
    warp::serve(routes).run(([127, 0, 0, 1], 3030)).await;
}
With all that, hopefully your traces have now been associated with one another!
Full code is available here and here.
Please, someone let me know if there is a better way! It seems ridiculous to me that there isn't better integration available. Sure some of this could maybe be a bit simpler and/or wrapped up in some nice middleware for your favorite client and server of choice... But I haven't found a crate or snippet of that anywhere!

What is the idiomatic way in Rust of assigning values of different types to a variable, depending on the compilation target OS?

I'm working on a codebase which binds to a Tokio socket and manages a TCP connection. In production, it binds to an AF_VSOCK using the tokio-vsock crate.
While developing locally on Mac, the AF_VSOCK API isn't available as there is no hypervisor -> VM connection — it's just being run natively using cargo run.
When running locally, I have been creating a standard tokio::net::TcpListener struct and in production I have been creating a tokio_vsock::VsockListener. Both structs are mostly interchangeable and expose the same methods. The rest of the code works perfectly, regardless of which struct is being used.
So far, I have just kept both structs and simply commented out the one that isn't needed locally — this is clearly not "good practice". My code is below:
#[tokio::main]
async fn main() -> Result<(), ()> {
    // Production AF_VSOCK listener (comment out locally)
    let mut listener = tokio_vsock::VsockListener::bind(
        &SockAddr::Vsock(
            VsockAddr::new(
                VMADDR_CID_ANY,
                LISTEN_PORT,
            )
        )
    )
    .expect("Unable to bind AF_VSOCK listener");

    // Local TCP listener (comment out in production)
    let mut listener = tokio::net::TcpListener::bind(
        std::net::SocketAddr::new(
            std::net::IpAddr::V4(
                std::net::Ipv4Addr::new(0, 0, 0, 0)
            ),
            LISTEN_PORT as u16,
        )
    )
    .await
    .expect("Unable to bind TCP listener");

    // This works regardless of which listener is used
    let mut incoming = listener.incoming();
    while let Some(socket) = incoming.next().await {
        match socket {
            Ok(mut stream) => {
                // Do something
            }
            Err(_) => {
                // Handle the accept error
            }
        }
    }
    Ok(())
}
I tried using the cfg!() macro with target_os set as the condition, but the compiler complained that the types returned by both bind() methods were mismatched.
My question is: What is the idiomatic way in Rust of assigning different values with different types to a variable, depending on the compilation target OS?
There are multiple options. The easiest, and a very common one with regard to usage in the stdlib itself, is the #[cfg] attribute (instead of cfg!()). The following code snippet clarifies its usage:
struct Linux;
impl Linux {
    fn x(&self) -> Linux {
        println!("Linux");
        Linux
    }
}

struct Windows;
impl Windows {
    fn x(&self) -> Windows {
        println!("Windows");
        Windows
    }
}

fn main() {
    #[cfg(not(target_os = "linux"))]
    let obj = {
        let obj = Linux;
        obj
    };
    #[cfg(not(target_os = "windows"))]
    let obj = {
        let obj = Windows;
        obj
    };
    let _ = obj.x();
}
(see https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=7088980d24c4a960c2158b091899d24d).
In your case this would be (untested):
#[tokio::main]
async fn main() -> Result<(), ()> {
    #[cfg(target_os = "linux")]
    let mut listener = tokio_vsock::VsockListener::bind(
        &SockAddr::Vsock(
            VsockAddr::new(
                VMADDR_CID_ANY,
                LISTEN_PORT,
            )
        )
    )
    .expect("Unable to bind AF_VSOCK listener");

    #[cfg(target_os = "macos")]
    let mut listener = tokio::net::TcpListener::bind(
        std::net::SocketAddr::new(
            std::net::IpAddr::V4(
                std::net::Ipv4Addr::new(0, 0, 0, 0)
            ),
            LISTEN_PORT as u16,
        )
    )
    .await
    .expect("Unable to bind TCP listener");

    ...
}
Check https://doc.rust-lang.org/reference/conditional-compilation.html for available conditions, including feature flags, in case target_os is not applicable enough.
The major difference between #[cfg] and cfg!() is that cfg!() does not remove any code. According to its documentation: "cfg!, unlike #[cfg], does not remove any code and only evaluates to true or false". That is why you get a compile error with cfg!(), whereas #[cfg] is more akin to #ifdefs in C/C++ and removes the unused code, so the compiler never sees the type mismatch.
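A tiny sketch of that difference: cfg!() is just an ordinary boolean expression, so both branches still have to type-check, which is exactly why it cannot choose between two listener types.
fn main() {
    // cfg!() expands to a compile-time true or false, but the code in the
    // unused branch is still compiled and type-checked, so both branches
    // must produce the same type.
    let label = if cfg!(target_os = "linux") {
        "compiled for linux"
    } else {
        "compiled for something else"
    };
    println!("{}", label);
}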

How to write using tokio Framed LinesCodec?

The following code reads from my server successfully; however, I cannot seem to get the correct syntax or semantics to write back to the server when a particular line command is recognized. Do I need to create a FramedWrite? Most examples I have found split the socket, but that seems like overkill; I expect the codec to be able to handle bi-directional I/O by providing some async write method.
# Cargo.toml
[dependencies]
tokio = { version = "0.3", features = ["full"] }
tokio-util = { version = "0.4", features = ["codec"] }
//! main.rs
use tokio::net::TcpStream;
use tokio_util::codec::{Framed, LinesCodec};
use tokio::stream::StreamExt;
use std::error::Error;
use std::net::{IpAddr, Ipv4Addr, SocketAddr};

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    let saddr = SocketAddr::new(IpAddr::V4(Ipv4Addr::new(127, 0, 0, 1)), 8081);
    let conn = TcpStream::connect(saddr).await?;
    let mut server = Framed::new(conn, LinesCodec::new_with_max_length(1024));
    while let Some(Ok(line)) = server.next().await {
        match line.as_str() {
            "READY" => println!("Want to write a line to the stream"),
            _ => println!("{}", line),
        }
    }
    Ok(())
}
According to the documentation, Framed implements the Stream and Sink traits. Sink defines only the bare minimum of low-level sending methods. To get the high-level awaitable methods like send() and send_all(), you need to use the SinkExt extension trait.
For example (playground):
use futures::sink::SinkExt;
// ...
while let Some(Ok(line)) = server.next().await {
    match line.as_str() {
        "READY" => server.send("foo").await?,
        _ => println!("{}", line),
    }
}
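If reading and writing ever do need to happen in separate tasks, the same Framed can also be split into a write half and a read half with StreamExt::split from the futures crate. A rough sketch under the same tokio / tokio-util versions as above:
use futures::{SinkExt, StreamExt};
use tokio::net::TcpStream;
use tokio_util::codec::{Framed, LinesCodec};

async fn split_read_write(conn: TcpStream) {
    let framed = Framed::new(conn, LinesCodec::new_with_max_length(1024));
    let (mut writer, mut reader) = framed.split();
    tokio::spawn(async move {
        // the write half is a Sink; SinkExt::send encodes and flushes one line
        let _ = writer.send(String::from("HELLO")).await;
    });
    while let Some(Ok(line)) = reader.next().await {
        println!("{}", line);
    }
}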
