I currently deserialize a JSON array into Vec<String>, and downstream in my application I convert each String to a SocketAddr.
I would like to deserialize directly into Vec<SocketAddr> with serde instead.
use serde::Deserialize;
use std::net::SocketAddr;
#[derive(Debug, Deserialize)]
struct Doc {
// It would be nice to have Vec<SocketAddr> instead
hosts: Vec<String>
}
fn main(){
let data = r#"
{"hosts": ["localhost:8000","localhost:8001"]}
"#;
let doc: Doc = serde_json::from_str(data).unwrap();
dbg!(doc);
}
I'd like to disagree with the other two answers: You can absolutely deserialize strings like "localhost:80" to a (Vec of) SocketAddr. But you absolutely shouldn't. Let me explain:
Your problem is that SocketAddr only holds an IP address + port, not hostnames. You can solve this by resolving hostnames into SocketAddr through ToSocketAddrs (and then flattening the result, because one hostname can resolve to multiple addresses):
use serde::{Deserialize, Deserializer};
use std::net::{SocketAddr, ToSocketAddrs};
#[derive(Debug, Deserialize)]
struct Doc {
#[serde(deserialize_with = "flatten_resolve_addrs")]
hosts: Vec<SocketAddr>,
}
fn flatten_resolve_addrs<'de, D>(de: D) -> Result<Vec<SocketAddr>, D::Error>
where
D: Deserializer<'de>,
{
// Being a little lazy here about allocations and error handling.
// Because again, you shouldn't do this.
let unresolved = Vec::<String>::deserialize(de)?;
let mut resolved = vec![];
for a in unresolved {
let a = a
.to_socket_addrs()
.map_err(|e| serde::de::Error::custom(e))?;
for a in a {
resolved.push(a);
}
}
Ok(resolved)
}
You shouldn't do this because deserialization should really round-trip through serialization and be a pure function of the input bytes, whereas resolving addresses may access the network. The problems are:
What's the retry logic when resolution fails?
If your code is long running, the resolution result might change (from dns load balancing, dynamic dns, network config changes, …), but you can't re-resolve the addresses.
If your code is started as a system service, it might fail to start up with a deserialization error if the network isn't fully configured yet.
If a connection to one of the specified addresses fails, you have no way of printing a nice error message with the target hostname.
How would you configure a default port for this?
(Doing these things wrong is a pet peeve of mine, you'll find it in nginx, Kubernetes, Zookeeper, …)
Personally, I'd probably keep the Vec<String> for simplicity reasons, but you might also choose to do something like deserializing to Vec<(Either<IpAddr, Box<str>>, Option<u16>)> so you can check whether the strings are valid addresses, but do things like hostname resolution and providing a default port when you connect to those addresses.
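As a rough illustration of that last idea, here is a minimal std-only sketch. The names Host, parse_target and resolve are made up for this example, a plain enum stands in for Either, and the parser deliberately ignores IPv6 literals. Parsing only checks syntax; resolution is a separate step that runs when you actually connect.

```rust
use std::io;
use std::net::{IpAddr, SocketAddr, ToSocketAddrs};

/// Either a literal IP address or a hostname that is not resolved yet.
#[derive(Debug, PartialEq)]
enum Host {
    Ip(IpAddr),
    Name(Box<str>),
}

/// Hypothetical parser for `host` or `host:port`. It validates syntax only
/// and never touches the network. Bare or bracketed IPv6 literals are not
/// handled in this sketch.
fn parse_target(s: &str) -> Result<(Host, Option<u16>), String> {
    let (host, port) = match s.rsplit_once(':') {
        Some((host, port)) => {
            let port = port
                .parse::<u16>()
                .map_err(|e| format!("bad port in {s:?}: {e}"))?;
            (host, Some(port))
        }
        None => (s, None),
    };
    if host.is_empty() {
        return Err(format!("empty host in {s:?}"));
    }
    let host = match host.parse::<IpAddr>() {
        Ok(ip) => Host::Ip(ip),
        Err(_) => Host::Name(host.into()),
    };
    Ok((host, port))
}

/// Resolution happens here, at connect time, so it can be retried or repeated.
fn resolve(host: &Host, port: u16) -> io::Result<Vec<SocketAddr>> {
    match host {
        Host::Ip(ip) => Ok(vec![SocketAddr::new(*ip, port)]),
        Host::Name(name) => Ok((&**name, port).to_socket_addrs()?.collect()),
    }
}

fn main() {
    let default_port = 8000; // assumed default for entries without a port
    for raw in ["localhost:8001", "127.0.0.1"] {
        let (host, port) = parse_target(raw).expect("syntactically valid target");
        match resolve(&host, port.unwrap_or(default_port)) {
            Ok(addrs) => println!("{raw} -> {addrs:?}"),
            Err(e) => eprintln!("could not resolve {raw}: {e}"),
        }
    }
}
```

Because the hostname is kept around, a failed connection can still be reported with the name the user configured, and the default port is applied only where the input left it out.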
Serde supports deserializing to SocketAddr, but your input isn't able to be parsed because localhost is not a valid IPv4 or IPv6 address.
The input needs to be a valid IPv4 or IPv6 address for it to deserialize directly into a SocketAddr. If you have control over your input data, this would work:
use serde::Deserialize;
use std::net::SocketAddr;
#[derive(Debug, Deserialize)]
struct Doc {
hosts: Vec<SocketAddr>,
}
fn main() {
let data = r#"
{"hosts": ["127.0.0.1:8000","127.0.0.1:8001"]}
"#;
let doc: Doc = serde_json::from_str(data).unwrap();
dbg!(doc);
}
Here's the output:
[src/main.rs:15] doc = Doc {
hosts: [
127.0.0.1:8000,
127.0.0.1:8001,
],
}
You can deserialize into SocketAddr directly, no extra work required:
use serde::Deserialize;
use std::net::SocketAddr;
#[derive(Debug, Deserialize)]
struct Doc {
// SocketAddr implements Deserialize, so this works directly
hosts: Vec<SocketAddr>,
}
fn main() {
let data = r#"
{"hosts": ["172.0.0.1:8000"]}
"#;
let doc: Doc = serde_json::from_str(data).unwrap();
dbg!(doc);
}
[src/main.rs:15] doc = Doc {
hosts: [
172.0.0.1:8000,
],
}
Your problem is that "localhost" is not a valid SocketAddr.
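The difference can be checked without serde at all. The following is a small sketch using only the standard library (the helper name parse_addr is made up for this example): parsing a SocketAddr accepts IP literals and never performs name resolution, so a hostname is rejected.

```rust
use std::net::{AddrParseError, SocketAddr};

/// Hypothetical helper: parses an `ip:port` literal without any name resolution.
fn parse_addr(s: &str) -> Result<SocketAddr, AddrParseError> {
    s.parse::<SocketAddr>()
}

fn main() {
    // A literal IPv4 address with a port parses fine.
    assert!(parse_addr("127.0.0.1:8000").is_ok());
    // IPv6 literals need brackets around the address part.
    assert!(parse_addr("[::1]:8000").is_ok());
    // A hostname is not an IP address, so parsing fails.
    assert!(parse_addr("localhost:8000").is_err());
}
```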
Ok, so I'm very new to Rust and I'm trying to clumsily piece together a little CLI tool that makes http requests and handles the responses, by using tokio, clap, reqwest and serde.
The tool accepts a customer number as input and then it tries to fetch information about the customer. The customer may or may not have a FooBar in each country.
My code currently only works if I get a nice 200 response containing a FooBar. If I don't, the deserialization fails (naturally). (Edit: Actually, this initial assumption about the problem seems to be false, see comments below)
My aim is to only attempt the deserialization if I actually get a valid response.
How would I do that? I feel the need to see the code of a valid approach to understand this better.
Below is the entirety of my program.
use clap::Parser;
use reqwest::Response;
use serde::{Deserialize, Serialize};
#[tokio::main]
async fn main() -> Result<(), reqwest::Error> {
let args: Cli = Cli::parse();
let client = reqwest::Client::new();
let countries = vec!["FR", "GB", "DE", "US"];
for country in countries.iter() {
let foo_bar : FooBar = client.get(
format!("http://example-service.com/countries/{}/customers/{}/foo_bar", country, args.customer_number))
.send()
.await?
.json()
.await?;
println!("{}", foo_bar.a_value);
}
Ok(())
}
#[derive(Debug, Serialize, Deserialize)]
struct FooBar {
a_value: String,
}
#[derive(Parser, Debug)]
struct Cli {
customer_number: i32,
}
There are a few ways to approach this issue. First of all, you can split the json() deserialization from send().await, i.e.:
for country in countries.iter() {
let resp: reqwest::Response = client.get(
format!("http://example-service.com/countries/{}/customers/{}/foo_bar", country, args.customer_number))
.send()
.await?;
if resp.status() != reqwest::StatusCode::OK {
eprintln!("didn't get OK status: {}", resp.status());
} else {
let foo_bar: FooBar = resp.json().await?;
println!("{}", foo_bar.a_value);
}
}
If you want to keep the response body around, you can extract it through let bytes = resp.bytes().await?; and pass bytes to serde_json::from_slice(&*bytes) for the deserialization attempt.
This can be useful if you have a set of expected error response bodies.
I would like to parse the following part of a .yaml-file:
networks:
foo1: 192.168.1.0/24
bar1: 192.168.2.0/24
foo2: 2001:CAFE:1::/64
bar2: 2001:CAFE:2::/64
... where the key of each property is the name of a network and the value is the subnet assigned to it.
The deserialized structs should look like the following:
#[derive(Debug, PartialEq, Serialize, Deserialize)]
struct Networks {
networks: Vec<Network>
}
#[derive(Debug, PartialEq, Serialize, Deserialize)]
struct Network {
name: String,
subnet: String,
}
I would prefer this notation over
networks:
- name: foo1
subnet: 192.168.1.0/24
- ...
since this would add unnecessary boilerplate. The problem is that I cannot properly parse this list of networks into structs, since to my current knowledge, the property of a struct has to have an equivalent key to it - and since the key-names here are "random" I cannot do that.
The only other workaround I've found is parsing the entries as HashMaps and automatically converting those into tuples of (String, String) (networks would then be a Vec<(String, String)>) via the serde-tuple-vec-map crate (which loses the named parameters).
This seems like something that would be easy to configure, yet I haven't found any solutions elsewhere.
Implementing a custom deserializer and using it with #[serde(deserialize_with = "network_tuples")] networks: Vec<Network> is one way of doing it:
fn network_tuples<'de, D>(des: D) -> Result<Vec<Network>, D::Error>
where
D: serde::Deserializer<'de>,
{
struct Vis(Vec<Network>);
impl<'de> serde::de::Visitor<'de> for Vis {
type Value = Vec<Network>;
fn expecting(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {
// Shown in error messages as "expected a map of network names to subnets"
formatter.write_str("a map of network names to subnets")
}
fn visit_map<A>(mut self, mut map: A) -> Result<Self::Value, A::Error>
where
A: serde::de::MapAccess<'de>,
{
while let Some((name, subnet)) = map.next_entry()? {
self.0.push(Network { name, subnet });
}
Ok(self.0)
}
}
des.deserialize_map(Vis(vec![]))
}
You could do without the visitor by first deserializing to a HashMap<String, String> and then converting, but that has a performance penalty and you'd lose the ability to have two networks with the same name (not sure if you want that though).
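For comparison, the conversion step of that HashMap variant might look roughly like the sketch below. It is std-only: the map is filled by hand here, whereas in the real program serde would produce it, and the helper name networks_from_map is made up for this example.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
struct Network {
    name: String,
    subnet: String,
}

/// Converts an already-deserialized map into the list of networks.
/// HashMap iteration order is unspecified, so the result is sorted by name.
fn networks_from_map(map: HashMap<String, String>) -> Vec<Network> {
    let mut networks: Vec<Network> = map
        .into_iter()
        .map(|(name, subnet)| Network { name, subnet })
        .collect();
    networks.sort_by(|a, b| a.name.cmp(&b.name));
    networks
}

fn main() {
    let mut map = HashMap::new();
    map.insert("foo1".to_string(), "192.168.1.0/24".to_string());
    map.insert("bar1".to_string(), "192.168.2.0/24".to_string());
    // Inserting the same key again overwrites the old value:
    // duplicate network names cannot survive a trip through a map.
    map.insert("foo1".to_string(), "10.0.0.0/8".to_string());
    let networks = networks_from_map(map);
    println!("{networks:?}");
}
```

Note that the map also forgets the order in which the entries appeared in the file, which the visitor version preserves.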
Of course, you could also combine my favorite serde trick #[serde(from = "SomeStructWithVecStringString")] with the serde-tuple-vec-map crate. But I've already written many answers using from, so I'll refer to an example.
I have a simple structure that contains a resources field. I would like my resources to always be Urls.
In the actual file that I am trying to deserialize, resources can be either URLs or paths.
Here is my structure:
pub struct Myfile {
pub resources: Vec<Resource>,
}
pub type Resource = Url;
I would like to use serde to:
Try to deserialize each Resource using the implementation from the url crate.
If it fails, try to deserialize each one of them into a Path and then use url::from_*_path() to get a Url.
I am trying to adapt the string or map, map and structure examples, but I am struggling to understand where to even start.
Since my end result will be a Url, the examples seem to show that I should be implementing Deserialize for Url. But I still need the current implementation. My Resource is an alias, so I can't implement Deserialize for it.
Is there any simple way to deserialize both Paths and Urls into Urls?
PS: I will probably go for the answer proposed but after reading through this post, I tried the following which seems to work too:
#[derive(Clone, Debug)]
pub struct Resource(Url);
impl<'de> Deserialize<'de> for Resource {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where D: Deserializer<'de>
{
let s = String::deserialize(deserializer)?;
if let path = Path::new(&s) {
if path.is_file() {
Ok(Resource(Url::from_file_path(path).unwrap()))
} else {
Ok(Resource(Url::from_directory_path(path).unwrap()))
}
} else {
Url::deserialize(s.into_deserializer()).map(|x| Resource(x))
}
}
}
But this is less convenient to work with than regular Urls.
I think the key to doing this with reasonable effort is to realize that serde::Deserialize for Url is also just cooking with water, i.e. just expecting a string and calling Url::parse on it.
So it's time to deploy my favourite serde trick: deserialize to a struct that serde can handle easily:
#[derive(Deserialize)]
pub struct MyfileDeserialize {
pub resources: Vec<String>,
}
Tell serde that it should get the struct you finally want from that easily handlable struct:
#[derive(Deserialize, Debug)]
#[serde(try_from = "MyfileDeserialize")]
pub struct Myfile {
pub resources: Vec<Resource>,
}
Finally, you just need to define how to turn MyfileDeserialize into Myfile.
impl TryFrom<MyfileDeserialize> for Myfile {
type Error = &'static str; // TODO: replace with anyhow or a proper error type
fn try_from(t: MyfileDeserialize) -> Result<Self, &'static str> {
Ok(Self {
resources: t
.resources
.into_iter()
.map(|url| {
if let Ok(url) = Url::parse(&url) {
return Ok(url);
};
if let Ok(url) = Url::from_file_path(url) {
return Ok(url);
};
// try more from_*_path here.
Err("Can't parse as URL or path")
})
.collect::<Result<Vec<_>, _>>()?,
})
}
}
Edit regarding your PS:
If you are willing to mess around with manually implementing deserializer traits and functions, I suppose you do have the option of completely getting rid of any wrapper structs or mediating TryFrom types: add a #[serde(deserialize_with = …)] to your resources, and in there, first do a <Vec<String>>::deserialize(de)?, and then turn that into a Vec<Url> as usual.
I am trying to work on a web app with Diesel and Rocket by following the Rocket guide. I have not been able to understand how to test this app.
//in main.rs
#[database("my_db")]
struct PgDbConn(diesel::PgConnection);
#[post("/users", format="application/json", data="<user_info>")]
fn create_new_user(conn: PgDbConn, user_info: Json<NewUser>) {
use schema::users;
diesel::insert_into(users::table).values(&*user_info).execute(&*conn).unwrap();
}
fn main() {
rocket::ignite()
.attach(PgDbConn::fairing())
.mount("/", routes![create_new_user])
.launch();
}
// in test.rs
use crate::PgDbConn;
#[test]
fn test_user_creation() {
let rocket = rocket::ignite().attach(PgDbConn::fairing());
let client = Client::new(rocket).unwrap();
let response = client
.post("/users")
.header(ContentType::JSON)
.body(r#"{"username": "xyz", "email": "temp#abc.com"}"#)
.dispatch();
assert_eq!(response.status(), Status::Ok);
}
But this modifies the database. How can I make sure that the test does not alter the database?
I tried to create two databases and use them in the following way (I am not sure if this is recommended):
#[cfg(test)]
#[database("test_db")]
struct PgDbConn(diesel::PgConnection);
#[cfg(not(test))]
#[database("live_db")]
struct PgDbConn(diesel::PgConnection);
Now I thought I could use the test_transaction method of the diesel::connection::Connection trait in the following way:
use crate::PgDbConn;
#[test]
fn test_user_creation() {
// !!This statement is wrong as PgDbConn is an Fn object instead of a struct
// !!I am not sure how it works but it seems that this Fn object is resolved
// !!into struct only when used as a Request Guard
let conn = PgDbConn;
// Deref trait for PgDbConn is implemented, So I thought that dereferencing
// it will return a diesel::PgConnection
(*conn).test_transaction::<_, (), _>(|| {
let rocket = rocket::ignite().attach(PgDbConn::fairing());
let client = Client::new(rocket).unwrap();
let response = client
.post("/users")
.header(ContentType::JSON)
.body(r#"{"username": "Tushar", "email": "temp#abc.com"}"#)
.dispatch();
assert_eq!(response.status(), Status::Ok);
Ok(())
});
}
The above code obviously fails to compile. Is there a way to resolve this Fn object into the struct and obtain the PgConnection in it? I am not even sure if this is the right way to do things.
Is there a recommended way to do testing while using both Rocket and Diesel?
This will fundamentally not work as you imagined there, as conn will be a different connection than whatever rocket generates for you. The test_transaction pattern assumes that you use the same connection for everything.
I want to convert multiple environment variables into a static struct.
I can do it manually:
Env {
is_development: env::var("IS_DEVELOPMENT")
.unwrap()
.parse::<bool>()
.unwrap(),
server: Server {
host: env::var("HOST").unwrap(),
port: env::var("PORT")
.unwrap()
.parse::<u16>()
.unwrap(),
},
}
But when there are many values, it becomes bloated. Is there a way to write a generic helper function that gives me a value of the type I specify, or panics? Something like this (or another solution):
fn get_env_var<T>(env_var_name: String) -> T {
// panic is ok here
let var = env::var(env_var_name).unwrap();
T::from(var)
}
get_env_var<u16>("PORT") // here i got u16
get_env_var<bool>("IS_DEVELOPMENT") // here is my boolean
Full example
use crate::server::logger::log_raw;
use dotenv::dotenv;
use serde::Deserialize;
use std::env;
#[derive(Deserialize, Debug, Clone)]
pub struct Server {
pub host: String,
pub port: u16,
}
#[derive(Deserialize, Debug, Clone)]
pub struct Env {
pub is_development: bool,
pub server: Server,
}
impl Env {
pub fn init() -> Self {
dotenv().expect(".env loading fail");
// how can i specify what type i expect?
fn get_env_var<T>(env_var_name: String) -> T {
// panic is ok here
let var = env::var(env_var_name).unwrap();
T::from(var)
}
// instead this
Env {
is_development: env::var("IS_DEVELOPMENT")
.unwrap()
.parse::<bool>()
.unwrap(),
server: Server {
host: env::var("HOST").unwrap(),
port: env::var("PORT")
.unwrap()
.parse::<u16>()
.unwrap(),
},
}
// do something like this
/*
Env {
is_development: get_env_var<bool>("IS_DEVELOPMENT"),
server: Server {
host: get_env_var<String>("HOST"),
port: get_env_var<u16>("PORT"),
},
}
*/
}
}
lazy_static! {
pub static ref ENV: Env = Env::init();
}
Like in your manual version, where you use str::parse, you can have the same requirement as str::parse, which is FromStr. So if you include the T: FromStr requirement, then you'll be able to do var.parse::<T>().
use std::env;
use std::fmt::Debug;
use std::str::FromStr;
fn get_env_var<T>(env_var_name: &str) -> T
where
T: FromStr,
T::Err: Debug,
{
let var = env::var(env_var_name).unwrap();
var.parse::<T>().unwrap()
}
Then run the following with PORT=1234 IS_DEVELOPMENT=true cargo run:
fn main() {
println!("{}", get_env_var::<u16>("PORT"));
println!("{}", get_env_var::<bool>("IS_DEVELOPMENT"));
}
Then it will output:
1234
true
Alternatively, you might want to be able to handle VarError::NotPresent and fallback to a default.
use std::env::{self, VarError};
use std::fmt::Debug;
use std::str::FromStr;
fn get_env_var<T>(env_var_name: &str) -> Result<T, VarError>
where
T: FromStr,
T::Err: Debug,
{
let var = env::var(env_var_name)?;
Ok(var.parse().unwrap())
}
Now, if you only execute PORT=1234 cargo run, you can handle the missing variable like this:
let is_dev = get_env_var::<bool>("IS_DEVELOPMENT")
.or_else(|err| match err {
VarError::NotPresent => Ok(false),
err => Err(err),
})
.unwrap();
println!("{:?}", is_dev);
If you want to fallback to Default if VarError::NotPresent:
fn get_env_var<T>(env_var_name: &str) -> T
where
T: FromStr,
T::Err: Debug,
T: Default,
{
let var = match env::var(env_var_name) {
Err(VarError::NotPresent) => return T::default(),
res => res.unwrap(),
};
var.parse().unwrap()
}
Rust genericity, inspired by Haskell's, works through traits and specifically trait bounds. This means that when you write
fn get_env_var<T>(env_var_name: String) -> T
since there is no trait bound on T there are essentially no capabilities for it (this is rather unlike C++).
Therefore, as far as rustc is concerned, pretty much the only thing it can do with a T is... take one as parameter then return it as-is.
Thus to do anything useful with a T (including creating one, whether from something else or de novo) you need to use the correct trait and provide the correct trait bounds.
The From trait is entirely the wrong trait to involve here: it specifies total (never-failing) conversions e.g. converting a u16 to a u32, which can never fail.
Whether it's converting a String to a bool or a u16, the conversion is quite obviously less than total: there is an infinity of string values which are not sequences of decimal digits describing a number below 2^16.
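A small std-only sketch of that contrast (the helper names widen and parse_port are made up for this example): the total conversion has no error type at all, while the string conversion has to return a Result.

```rust
use std::num::ParseIntError;

/// Total conversion: every u16 is a valid u32, so `From` applies.
fn widen(port: u16) -> u32 {
    u32::from(port)
}

/// Partial conversion: most strings are not a valid u16, so the result is fallible.
fn parse_port(s: &str) -> Result<u16, ParseIntError> {
    s.parse::<u16>()
}

fn main() {
    assert_eq!(widen(65_535), 65_535);
    assert_eq!(parse_port("8080"), Ok(8080));
    assert!(parse_port("65536").is_err()); // one past u16::MAX
    assert!(parse_port("port").is_err()); // not a number at all
    // bool parsing only accepts "true" and "false".
    assert!("yes".parse::<bool>().is_err());
}
```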
In Rust, the signifier of fallibility tends to be Try. There is a TryFrom trait; however, for historical reasons, and as documented in its signature, the str::parse method is hooked on the FromStr trait.
This means in order to declare that your T can be created from a string (and use the parse method to create one), you need to bound T to FromStr. And of course indicate that it may fail, and will return whatever error T generates when it can't be parsed from a string:
fn get_env_var<T: FromStr>(env_var_name: String) -> Result<T, T::Err> {
let var = env::var(env_var_name).unwrap();
var.parse()
}
Incidentally, taking a String as input is usually avoided unless you really have to[0]. Usually you'd take an &str, that's a lot more flexible as it can be used e.g. with string literals (which are of type &'static str).
So
fn get_env_var<T: FromStr>(env_var_name: &str) -> Result<T, T::Err> {
let var = env::var(env_var_name).unwrap();
var.parse()
}
[0] or for efficiency purposes sometimes
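If panicking on a missing variable is not acceptable either, both failure modes can be reported through one error type. This is only a sketch under assumptions: EnvError and try_env_var are names invented for this example, and the variable name used in main is assumed to be unset.

```rust
use std::env::{self, VarError};
use std::str::FromStr;

/// Hypothetical error type covering both ways the lookup can fail.
#[derive(Debug)]
enum EnvError<E> {
    /// The variable is missing or not valid Unicode.
    Var(VarError),
    /// The variable is present but could not be parsed as the requested type.
    Parse(E),
}

fn try_env_var<T: FromStr>(name: &str) -> Result<T, EnvError<T::Err>> {
    let raw = env::var(name).map_err(EnvError::Var)?;
    raw.parse::<T>().map_err(EnvError::Parse)
}

fn main() {
    // The variable name below is made up and assumed to be unset.
    match try_env_var::<u16>("SKETCH_PORT_THAT_IS_NOT_SET") {
        Ok(port) => println!("port = {port}"),
        Err(EnvError::Var(e)) => println!("lookup failed: {e}"),
        Err(EnvError::Parse(e)) => println!("parse failed: {e}"),
    }
}
```

The caller can then decide per variable whether a missing value means "use a default" and a malformed value means "abort", instead of having that decision baked into the helper.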