How can I put an async function into a Vec in Rust? - rust

I need to put some futures in a Vec for later joining. However if I try to collect it using an iterator, the compiler doesn't seem to be able to determine the type for the vector.
I'm trying to create a command line utility that accepts an arbitrary number of IP addresses, communicates with those remotes and collects the results for printing. The communication function works well, I've cut down the program to show the failure I need to understand.
use futures::future::join_all;
use itertools::Itertools;
use std::net::SocketAddr;
use std::str::from_utf8;
use std::fmt;
#[tokio::main(flavor = "current_thread")]
pub async fn main() -> Result<(), Box<dyn std::error::Error>> {
let socket: Vec<SocketAddr> = vec![
"192.168.20.33:502".parse().unwrap(),
"192.168.20.34:502".parse().unwrap(),];
let async_vec = vec![
MyStruct::get(socket[0]),
MyStruct::get(socket[1]),];
// The above 3 lines happen to work to build a Vec because there are
// 2 sockets. But I need to build a Vec to join_all from an arbitary
// number of addresses. Why doesn't the line below work instead?
//let async_vec = socket.iter().map(|x| MyStruct::get(*x)).collect();
let rt = join_all(async_vec).await;
let results = rt.iter().map(|x| x.as_ref().unwrap().to_string()).join("\n");
let mut rvec: Vec<String> = results.split("\n").map(|x| x.to_string()).collect();
rvec.sort_by(|a, b| a[15..20].cmp(&b[15..20]));
println!("{}", rvec.join("\n"));
Ok(())
}
struct MyStruct {
serial: [u8; 12],
placeholder: String,
}
impl fmt::Display for MyStruct {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
let serial = match from_utf8(&self.serial) {
Ok(v) => v,
Err(_) => "(invalid)",
};
let lines = (1..4).map(|x| format!("{}, line{}, {}", serial, x, self.placeholder)).join("\n");
write!(f, "{}", lines)
}
}
impl MyStruct {
pub async fn get(sockaddr: SocketAddr) -> Result<MyStruct, Box<dyn std::error::Error>> {
let char = sockaddr.ip().to_string().chars().last().unwrap();
let rv = MyStruct{serial: [char as u8;12], placeholder: sockaddr.to_string(), };
Ok(rv)
}
}

This line:
let async_vec = socket.iter().map(|x| MyStruct::get(*x)).collect();
doesn't work because the compiler can't know that you want to collect everything into a Vec. You might want to collect into some other container (e.g. a linked list or a set). Therefore you need to tell the compiler the kind of container you want with:
let async_vec = socket.iter().map(|x| MyStruct::get(*x)).collect::<Vec::<_>>();
or:
let async_vec: Vec::<_> = socket.iter().map(|x| MyStruct::get(*x)).collect();

Related

How do I read CSV data without knowing the structure at compile time?

I'm pretty new to Rust and trying to implement some kind of database. Users should create tables by giving a table name, a vector of column names and a vector of column types (realized over an enum). Filling tables should be done by specifying csv files. However, this requires the structure of the table rows to be specified at compile time, like shown in the basic example:
#[derive(Debug, Deserialize, Eq, PartialEq)]
struct Row {
key: u32,
name: String,
comment: String
}
use std::error::Error;
use csv::ReaderBuilder;
use serde::Deserialize;
use std::fs;
fn read_from_file(path: &str) -> Result<(), Box<dyn Error>> {
let data = fs::read_to_string(path).expect("Unable to read file");
let mut rdr = ReaderBuilder::new()
.has_headers(false)
.delimiter(b'|')
.from_reader(data.as_bytes());
let mut iter = rdr.deserialize();
if let Some(result) = iter.next() {
let record:Row = result?;
println!("{:?}", record);
Ok(())
} else {
Err(From::from("expected at least one record but got none"))
}
}
Is there a possibility to use the generic table information instead of the "Row"-struct to cast the results from the deserialization? Is it possible to simply allocate memory according to the combined sizes of the column types and parse the records in? I would do something like this in C...
Is there a possibility to use the generic table information instead of the "Row"-struct to cast the results from the deserialization?
All generics replaced with concrete types at compile time. If you do not know types you will need in runtime, "generics" is not what you need.
Is it possible to simply allocate memory according to the combined sizes of the column types and parse the records in? I would do something like this in C...
I suggest using Box<dyn Any> instead, to be able to store reference of any type and, still, know what type it is.
Maintenance cost for this approach is pretty high. You have to manage each possible value type everywhere you want to use a cell's value. On the other hand, you do not need to parse value each time, just make some type checks in runtime.
I have used std::any::TypeId to identify type, but it can not be used in match expressions. You can consider using custom enum as type identifier.
use std::any::{Any, TypeId};
use std::io::Read;
use csv::Reader;
#[derive(Default)]
struct Table {
name: String,
headers: Vec<(String, TypeId)>,
data: Vec<Vec<Box<dyn Any>>>,
}
impl Table {
fn add_header(&mut self, header: String, _type: TypeId) {
self.headers.push((header, _type));
}
fn populate_data<R: Read>(
&mut self,
rdr: &mut Reader<R>,
) -> Result<(), Box<dyn std::error::Error>> {
for record in rdr.records() {
let record = record?;
let mut row: Vec<Box<dyn Any>> = vec![];
for (&(_, type_id), value) in self.headers.iter().zip(record.iter()) {
if type_id == TypeId::of::<u32>() {
row.push(Box::new(value.parse::<u32>()?));
} else if type_id == TypeId::of::<String>() {
row.push(Box::new(value.to_owned()));
}
}
self.data.push(row);
}
Ok(())
}
}
impl std::fmt::Display for Table {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
writeln!(f, "Table: {}", self.name)?;
for (name, _) in self.headers.iter() {
write!(f, "{}, ", name)?;
}
writeln!(f)?;
for row in self.data.iter() {
for cell in row.iter() {
if let Some(&value) = cell.downcast_ref::<u32>() {
write!(f, "{}, ", value)?;
} else if let Some(value) = cell.downcast_ref::<String>() {
write!(f, "{}, ", value)?;
}
}
writeln!(f)?;
}
Ok(())
}
}
fn main() {
let mut table: Table = Default::default();
table.name = "Foo".to_owned();
table.add_header("key".to_owned(), TypeId::of::<u32>());
table.add_header("name".to_owned(), TypeId::of::<String>());
table.add_header("comment".to_owned(), TypeId::of::<String>());
let data = "\
key,name,comment
1,foo,foo comment
2,bar,bar comment
";
let mut rdr = Reader::from_reader(data.as_bytes());
table.populate_data(&mut rdr).unwrap();
print!("{}", table);
}

How to convert a Bytes Iterator into a Stream in Rust

I'm trying to figure out build a feature which requires reading the contents of a file into a futures::stream::BoxStream but I'm having a tough time figuring out what I need to do.
I have figured out how to read a file byte by byte via Bytes which implements an iterator.
use std::fs::File;
use std::io::prelude::*;
use std::io::{BufReader, Bytes};
// TODO: Convert this to a async Stream
fn async_read() -> Box<dyn Iterator<Item = Result<u8, std::io::Error>>> {
let f = File::open("/dev/random").expect("Could not open file");
let reader = BufReader::new(f);
let iter = reader.bytes().into_iter();
Box::new(iter)
}
fn main() {
ctrlc::set_handler(move || {
println!("received Ctrl+C!");
std::process::exit(0);
})
.expect("Error setting Ctrl-C handler");
for b in async_read().into_iter() {
println!("{:?}", b);
}
}
However, I've been struggling a bunch trying to figure out how I can turn this Box<dyn Iterator<Item = Result<u8, std::io::Error>>> into an Stream.
I would have thought something like this would work:
use futures::stream;
use std::fs::File;
use std::io::prelude::*;
use std::io::{BufReader, Bytes};
// TODO: Convert this to a async Stream
fn async_read() -> stream::BoxStream<'static, dyn Iterator<Item = Result<u8, std::io::Error>>> {
let f = File::open("/dev/random").expect("Could not open file");
let reader = BufReader::new(f);
let iter = reader.bytes().into_iter();
std::pin::Pin::new(Box::new(stream::iter(iter)))
}
fn main() {
ctrlc::set_handler(move || {
println!("received Ctrl+C!");
std::process::exit(0);
})
.expect("Error setting Ctrl-C handler");
while let Some(b) = async_read().poll() {
println!("{:?}", b);
}
}
But I keep getting a ton of compiler errors, I've tried other permutations but generally getting no where.
One of the compiler errors:
std::pin::Pin::new
``` --> src/main.rs:14:24
|
14 | std::pin::Pin::new(Box::new(stream::iter(iter)))
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected trait object `dyn std::iter::Iterator`, found enum `std::result::Result`
Anyone have any advice?
I'm pretty new to Rust, and specifically Streams/lower level stuff so I apologize if I got anything wrong, feel free to correct me.
For some additional background, I'm trying to do this so you can CTRL-C out of a command in nushell
I think you are overcomplicating it a bit, you can just return impl Stream from async_read, there is no need to box or pin (same goes for the original Iterator-based version). Then you need to set up an async runtime in order to poll the stream (in this example I just use the runtime provided by futures::executor::block_on). Then you can call futures::stream::StreamExt::next() on the stream to get a future representing the next item.
Here is one way to do this:
use futures::prelude::*;
use std::{
fs::File,
io::{prelude::*, BufReader},
};
fn async_read() -> impl Stream<Item = Result<u8, std::io::Error>> {
let f = File::open("/dev/random").expect("Could not open file");
let reader = BufReader::new(f);
stream::iter(reader.bytes())
}
async fn async_main() {
while let Some(b) = async_read().next().await {
println!("{:?}", b);
}
}
fn main() {
ctrlc::set_handler(move || {
println!("received Ctrl+C!");
std::process::exit(0);
})
.expect("Error setting Ctrl-C handler");
futures::executor::block_on(async_main());
}

Equivalent of inet_ntop in Rust

Is there a readily available way to convert ip addresses (both v4 and v6) from binary to text form in Rust (an equivalent to inet_ntop)?
Examples:
"3701A8C0" converts to "55.1.168.192",
"20010db8000000000000000000000001" converts to "2001:db8::1".
inet_ntop takes as input a struct in_addr* or a struct in6_addr*. The direct equivalent of those structs in Rust are Ipv4Addr and Ipv6Addr, both of which implement the Display trait and can therefore be formatted easily to text or printed:
let addr = Ipv4Addr::from (0x3701A8C0);
assert_eq!(format!("{}", addr), String::from ("55.1.168.192"));
println!("{}", addr);
AFAIK, there is not a direct conversion, but you can do that with from_str_radix, and then with the conversion of an ip from a numeric type:
use std::{
error::Error,
io,
net::{IpAddr, Ipv4Addr, Ipv6Addr},
str::FromStr,
};
fn convert(s: &str) -> io::Result<IpAddr> {
if let Ok(u) = u32::from_str_radix(s, 16) {
Ok(Ipv4Addr::from(u).into())
} else if let Ok(u) = u128::from_str_radix(s, 16) {
Ok(Ipv6Addr::from(u).into())
} else {
Err(io::Error::new(io::ErrorKind::InvalidData, "Invalid input"))
}
}
fn main() -> Result<(), Box<dyn Error>> {
let ip = convert("3701A8C0")?;
assert_eq!(ip, IpAddr::from_str("55.1.168.192")?);
let ip = convert("20010db8000000000000000000000001")?;
assert_eq!(ip, IpAddr::from_str("2001:db8::1")?);
Ok(())
}
If you already know that it is, for example, and IPV4, this is a one-liner:
use std::{
error::Error,
net::{IpAddr, Ipv4Addr},
str::FromStr,
};
fn main() -> Result<(), Box<dyn Error>> {
let ip = u32::from_str_radix("3701A8C0", 16).map(Ipv4Addr::from)?;
assert_eq!(ip, IpAddr::from_str("55.1.168.192")?);
Ok(())
}

How can I return the combination of two borrowed RefCells?

I have a struct with two Vecs wrapped in RefCells. I want to have a method on that struct that combines the two vectors and returns them as a new RefCell or RefMut:
use std::cell::{RefCell, RefMut};
struct World {
positions: RefCell<Vec<Option<Position>>>,
velocities: RefCell<Vec<Option<Velocity>>>,
}
type Position = i32;
type Velocity = i32;
impl World {
pub fn new() -> World {
World {
positions: RefCell::new(vec![Some(1), None, Some(2)]),
velocities: RefCell::new(vec![None, None, Some(1)]),
}
}
pub fn get_pos_vel(&self) -> RefMut<Vec<(Position, Velocity)>> {
let mut poses = self.positions.borrow_mut();
let mut vels = self.velocities.borrow_mut();
poses
.iter_mut()
.zip(vels.iter_mut())
.filter(|(e1, e2)| e1.is_some() && e2.is_some())
.map(|(e1, e2)| (e1.unwrap(), e2.unwrap()))
.for_each(|elem| println!("{:?}", elem));
}
}
fn main() {
let world = World::new();
world.get_pos_vel();
}
How would I return the zipped contents of the vectors as a new RefCell? Is that possible?
I know there is RefMut::map() and I tried to nest two calls to map, but didn't succeed with that.
You want to be able to modify the positions and velocities. If these have to be stored in two separate RefCells, what about side-stepping the problem and using a callback to do the modification?
use std::cell::RefCell;
struct World {
positions: RefCell<Vec<Option<Position>>>,
velocities: RefCell<Vec<Option<Velocity>>>,
}
type Position = i32;
type Velocity = i32;
impl World {
pub fn new() -> World {
World {
positions: RefCell::new(vec![Some(1), None, Some(2)]),
velocities: RefCell::new(vec![None, None, Some(1)]),
}
}
pub fn modify_pos_vel<F: FnMut(&mut Position, &mut Velocity)>(&self, mut f: F) {
let mut poses = self.positions.borrow_mut();
let mut vels = self.velocities.borrow_mut();
poses
.iter_mut()
.zip(vels.iter_mut())
.filter_map(|pair| match pair {
(Some(e1), Some(e2)) => Some((e1, e2)),
_ => None,
})
.for_each(|pair| f(pair.0, pair.1))
}
}
fn main() {
let world = World::new();
world.modify_pos_vel(|position, velocity| {
// Some modification goes here, for example:
*position += *velocity;
});
}
If you want to return a new Vec, then you don't need to wrap it in RefMut or RefCell:
Based on your code with filter and map
pub fn get_pos_vel(&self) -> Vec<(Position, Velocity)> {
let mut poses = self.positions.borrow_mut();
let mut vels = self.velocities.borrow_mut();
poses.iter_mut()
.zip(vels.iter_mut())
.filter(|(e1, e2)| e1.is_some() && e2.is_some())
.map(|(e1, e2)| (e1.unwrap(), e2.unwrap()))
.collect()
}
Alternative with filter_map
poses.iter_mut()
.zip(vels.iter_mut())
.filter_map(|pair| match pair {
(Some(e1), Some(e2)) => Some((*e1, *e2)),
_ => None,
})
.collect()
You can wrap it in RefCell with RefCell::new, if you really want to, but I would leave it up to the user of the function to wrap it in whatever they need.

Using the same iterator multiple times in Rust

Editor's note: This code example is from a version of Rust prior to 1.0 when many iterators implemented Copy. Updated versions of this code produce a different errors, but the answers still contain valuable information.
I'm trying to write a function to split a string into clumps of letters and numbers; for example, "test123test" would turn into [ "test", "123", "test" ]. Here's my attempt so far:
pub fn split(input: &str) -> Vec<String> {
let mut bits: Vec<String> = vec![];
let mut iter = input.chars().peekable();
loop {
match iter.peek() {
None => return bits,
Some(c) => if c.is_digit() {
bits.push(iter.take_while(|c| c.is_digit()).collect());
} else {
bits.push(iter.take_while(|c| !c.is_digit()).collect());
}
}
}
return bits;
}
However, this doesn't work, looping forever. It seems that it is using a clone of iter each time I call take_while, starting from the same position over and over again. I would like it to use the same iter each time, advancing the same iterator over all the each_times. Is this possible?
As you identified, each take_while call is duplicating iter, since take_while takes self and the Peekable chars iterator is Copy. (Only true before Rust 1.0 — editor)
You want to be modifying the iterator each time, that is, for take_while to be operating on an &mut to your iterator. Which is exactly what the .by_ref adaptor is for:
pub fn split(input: &str) -> Vec<String> {
let mut bits: Vec<String> = vec![];
let mut iter = input.chars().peekable();
loop {
match iter.peek().map(|c| *c) {
None => return bits,
Some(c) => if c.is_digit(10) {
bits.push(iter.by_ref().take_while(|c| c.is_digit(10)).collect());
} else {
bits.push(iter.by_ref().take_while(|c| !c.is_digit(10)).collect());
},
}
}
}
fn main() {
println!("{:?}", split("123abc456def"))
}
Prints
["123", "bc", "56", "ef"]
However, I imagine this is not correct.
I would actually recommend writing this as a normal for loop, using the char_indices iterator:
pub fn split(input: &str) -> Vec<String> {
let mut bits: Vec<String> = vec![];
if input.is_empty() {
return bits;
}
let mut is_digit = input.chars().next().unwrap().is_digit(10);
let mut start = 0;
for (i, c) in input.char_indices() {
let this_is_digit = c.is_digit(10);
if is_digit != this_is_digit {
bits.push(input[start..i].to_string());
is_digit = this_is_digit;
start = i;
}
}
bits.push(input[start..].to_string());
bits
}
This form also allows for doing this with much fewer allocations (that is, the Strings are not required), because each returned value is just a slice into the input, and we can use lifetimes to state this:
pub fn split<'a>(input: &'a str) -> Vec<&'a str> {
let mut bits = vec![];
if input.is_empty() {
return bits;
}
let mut is_digit = input.chars().next().unwrap().is_digit(10);
let mut start = 0;
for (i, c) in input.char_indices() {
let this_is_digit = c.is_digit(10);
if is_digit != this_is_digit {
bits.push(&input[start..i]);
is_digit = this_is_digit;
start = i;
}
}
bits.push(&input[start..]);
bits
}
All that changed was the type signature, removing the Vec<String> type hint and the .to_string calls.
One could even write an iterator like this, to avoid having to allocate the Vec. Something like fn split<'a>(input: &'a str) -> Splits<'a> { /* construct a Splits */ } where Splits is a struct that implements Iterator<&'a str>.
take_while takes self by value: it consumes the iterator. Before Rust 1.0 it also was unfortunately able to be implicitly copied, leading to the surprising behaviour that you are observing.
You cannot use take_while for what you are wanting for these reasons. You will need to manually unroll your take_while invocations.
Here is one of many possible ways of dealing with this:
pub fn split(input: &str) -> Vec<String> {
let mut bits: Vec<String> = vec![];
let mut iter = input.chars().peekable();
loop {
let seeking_digits = match iter.peek() {
None => return bits,
Some(c) => c.is_digit(10),
};
if seeking_digits {
bits.push(take_while(&mut iter, |c| c.is_digit(10)));
} else {
bits.push(take_while(&mut iter, |c| !c.is_digit(10)));
}
}
}
fn take_while<I, F>(iter: &mut std::iter::Peekable<I>, predicate: F) -> String
where
I: Iterator<Item = char>,
F: Fn(&char) -> bool,
{
let mut out = String::new();
loop {
match iter.peek() {
Some(c) if predicate(c) => out.push(*c),
_ => return out,
}
let _ = iter.next();
}
}
fn main() {
println!("{:?}", split("test123test"));
}
This yields a solution with two levels of looping; another valid approach would be to model it as a state machine one level deep only. Ask if you aren’t sure what I mean and I’ll demonstrate.

Resources