Deserialize JSON list of hex strings as bytes - rust

Iʼm trying to read a JSON stream, part of which looks like
"data": [
"c1a8f800a4393e0cacd05a5bc60ae3e0",
"bbac4013c1ca3482155b584d35dac185",
"685f237d4fcbd191c981b94ef6986cde",
"a08898e81f1ddb6612aa12641b856aa9"
]
(there are more entries in the data list and each entry is longer, but this should be illustrative; both the length of the list and the length of each hex string are known at compile time)
Ideally Iʼd want a single [u8; 64] (the actual size is known at compile time), or failing that, a Vec<u8>, but I imagine itʼs easier to deserialize it as a Vec<[u8; 16]> and merge them later. However, Iʼm having trouble doing even that.
The hex crate has a way to deserialize a single hex string as a Vec or array of u8, but I canʼt figure out how to tell Serde to do that for each entry of the list. Is there a simple way to do that Iʼm overlooking, or do I need to write my own list deserializer?

Serde has the power to use serializers and deserializers from other crates in a nested fashion using #[serde(with = "...")]. Since hex has a serde feature, this can be easily done.
Here is a simple example using serde_json and hex.
Cargo.toml
serde = { version = "1.0.133", features = ["derive"] }
serde_json = "1.0.74"
hex = { version = "0.4", features = ["serde"] }
main.rs
use serde::{Deserialize, Serialize};
use serde_json::Result;
#[derive(Serialize, Deserialize, Debug)]
struct MyData {
data: Vec<MyHex>,
}
#[derive(Serialize, Deserialize, Debug)]
#[serde(transparent)]
struct MyHex {
#[serde(with = "hex::serde")]
hex: Vec<u8>,
}
fn main() -> Result<()> {
let data = r#"
{
"data": [
"c1a8f800a4393e0cacd05a5bc60ae3e0",
"bbac4013c1ca3482155b584d35dac185",
"685f237d4fcbd191c981b94ef6986cde",
"a08898e81f1ddb6612aa12641b856aa9"
]
}
"#;
let my_data: MyData = serde_json::from_str(data)?;
println!("{:?}", my_data); // MyData { data: [MyHex { hex: [193, 168, 248, 0, 164, 57, 62, 12, 172, 208, 90, 91, 198, 10, 227, 224] }, MyHex { hex: [187, 172, 64, 19, 193, 202, 52, 130, 21, 91, 88, 77, 53, 218, 193, 133] }, MyHex { hex: [104, 95, 35, 125, 79, 203, 209, 145, 201, 129, 185, 78, 246, 152, 108, 222] }, MyHex { hex: [160, 136, 152, 232, 31, 29, 219, 102, 18, 170, 18, 100, 27, 133, 106, 169] }] }
Ok(())
}
Serde With Reference
Hex Serde Reference

In some performance-critical situations, it may be advantageous to implement your own deserializer and use it with serde(deserialize_with = …).
If you go that route, you have to:
Implement a deserialization function for data
Implement a visitor which takes a sequence of precisely 4 blocks
These blocks then need another deserialization function that turns a string into [u8; 16]
use serde::{Deserialize, Deserializer};
#[derive(Deserialize, Debug)]
pub struct Foo {
#[serde(deserialize_with = "deserialize_array_of_hex")]
pub data: [u8; 64],
}
fn deserialize_array_of_hex<'de, D: Deserializer<'de>>(d: D) -> Result<[u8; 64], D::Error> {
use serde::de;
use std::fmt;
#[derive(Deserialize)] // the derive macro is already imported from serde above
struct Block(#[serde(with="hex::serde")] [u8; 16]);
struct VecVisitor;
impl<'de> de::Visitor<'de> for VecVisitor {
type Value = [u8; 64];
fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
write!(formatter, "a list containing 4 hex strings")
}
fn visit_seq<A>(self, mut seq: A) -> Result<Self::Value, A::Error>
where
A: de::SeqAccess<'de>,
{
let mut data = [0; 64];
for i in 0..4 {
let block = seq
.next_element::<Block>()?
.ok_or_else(|| de::Error::custom("too short"))?;
for j in 0..16 {
data[i * 16 + j] = block.0[j];
}
}
if seq.next_element::<String>()?.is_some() {
return Err(de::Error::custom("too long"))
}
Ok(data)
}
}
d.deserialize_seq(VecVisitor)
}
Full example playground. One could also implement DeserializeSeed for Block and only pass a reference to the [u8; 64] to be written into, but I suspect that copying 16 bytes is negligibly cheap. (Edit: I measured it, it turns out about 10% faster than the other two solutions in this post (when using hex::decode_to_slice in visit_str).)
Actually, never mind implementing your own deserializer for performance: the above solution is about equal in performance to
use serde::Deserialize;
#[derive(Deserialize, Debug)]
#[serde(from = "MyDataPre")]
pub struct MyData {
pub data: [u8; 64],
}
impl From<MyDataPre> for MyData {
fn from(p: MyDataPre) -> Self {
let mut data = [0; 64];
for b in 0..4 {
for j in 0..16 {
data[b * 16 + j] = p.data[b].0[j];
}
}
MyData { data }
}
}
#[derive(Deserialize, Debug)]
pub struct MyDataPre {
data: [MyHex; 4],
}
#[derive(Deserialize, Debug)]
struct MyHex (#[serde(with = "hex::serde")] [u8; 16]);
The trick here is the use of #[serde(from = …)], which allows you to deserialize to some other struct, and then tell serde how to convert that to the struct you originally wanted.
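For reference, here is the decode-and-merge step on its own, as a minimal std-only sketch without serde or the hex crate. The helper names hex_val and decode_blocks are made up for illustration, and the sizes (4 blocks of 32 hex digits) are taken from the trimmed example in the question, not from the real data.

```rust
/// Decode one ASCII hex digit into its 4-bit value.
fn hex_val(c: u8) -> Result<u8, String> {
    match c {
        b'0'..=b'9' => Ok(c - b'0'),
        b'a'..=b'f' => Ok(c - b'a' + 10),
        b'A'..=b'F' => Ok(c - b'A' + 10),
        _ => Err(format!("invalid hex digit: {:?}", c as char)),
    }
}

/// Decode exactly 4 hex strings of 32 digits each into one [u8; 64].
/// (Hypothetical helper; the sizes mirror the question's example.)
fn decode_blocks(blocks: &[&str]) -> Result<[u8; 64], String> {
    if blocks.len() != 4 {
        return Err(format!("expected 4 blocks, got {}", blocks.len()));
    }
    let mut out = [0u8; 64];
    // Each 16-byte chunk of the output is filled from one hex string.
    for (chunk, block) in out.chunks_exact_mut(16).zip(blocks) {
        let digits = block.as_bytes();
        if digits.len() != 32 {
            return Err(format!("expected 32 hex digits, got {}", digits.len()));
        }
        for (byte, pair) in chunk.iter_mut().zip(digits.chunks_exact(2)) {
            *byte = (hex_val(pair[0])? << 4) | hex_val(pair[1])?;
        }
    }
    Ok(out)
}

fn main() {
    let blocks = [
        "c1a8f800a4393e0cacd05a5bc60ae3e0",
        "bbac4013c1ca3482155b584d35dac185",
        "685f237d4fcbd191c981b94ef6986cde",
        "a08898e81f1ddb6612aa12641b856aa9",
    ];
    let data = decode_blocks(&blocks).unwrap();
    assert_eq!(&data[..4], &[0xc1, 0xa8, 0xf8, 0x00]);
    println!("{:02x?}", data);
}
```

Writing straight into slices of the final array is the same idea as calling hex::decode_to_slice on each 16-byte window, and it avoids the intermediate [u8; 16] copies.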

Related

Rust Unique Return Counts / Unique with Frequency

What is the fastest way to get the unique elements in a vector and their counts, similar to numpy.unique(return_counts=True)? The code below becomes exceedingly slow as the array grows into the millions.
use std::collections::HashMap;
use itertools::Itertools;
fn main () {
let kmers: Vec<u8> = vec![64, 64, 64, 65, 65, 65];
let nodes: HashMap<u8, usize> = kmers
.iter()
.unique()
.map(|kmer| {
let count = kmers.iter().filter(|x| x == &kmer).count();
(kmer.to_owned(), count)
})
.collect();
println!("{:?}", nodes)
}
You can use the entry API for this. The linked docs have a similar example to what you need, here it is modified to fit your case:
use std::collections::HashMap;
fn main () {
let kmers: Vec<u8> = vec![64, 64, 64, 65, 65, 65];
let mut nodes: HashMap<u8, usize> = HashMap::new();
for n in kmers.iter() {
nodes.entry(*n).and_modify(|count| *count += 1).or_insert(1);
}
println!("{:?}", nodes)
}
playground
If you want the output to be sorted, you can use a BTreeMap instead.
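A minimal sketch of the BTreeMap variant, using only the standard library. The function name count_sorted is just for illustration.

```rust
use std::collections::BTreeMap;

/// Count occurrences of each value; iterating the result yields keys in ascending order.
fn count_sorted(items: &[u8]) -> BTreeMap<u8, usize> {
    let mut counts = BTreeMap::new();
    for &item in items {
        // Insert 0 on first sight, then increment in place.
        *counts.entry(item).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let kmers: Vec<u8> = vec![65, 64, 65, 64, 64, 65];
    let nodes = count_sorted(&kmers);
    println!("{:?}", nodes); // {64: 3, 65: 3}
}
```

Like the HashMap version, this makes a single pass over the input, so it avoids the quadratic rescan in the question's code.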
If you prefer a one-liner, you can use itertools' counts() (this uses the same code as in @PitaJ's answer under the hood, with a small improvement):
use std::collections::HashMap;
use itertools::Itertools;
fn main () {
let kmers: Vec<u8> = vec![64, 64, 64, 65, 65, 65];
let nodes: HashMap<u8, usize> = kmers.iter().copied().counts();
println!("{:?}", nodes)
}

How to parse nested struct member variables in rust proc_macro?

I am attempting to write a macro that will generate a telemetry function for any struct by using #[derive(Telemetry)]. This function will send a data stream to anything provided that implements io::Write. This data stream will be "self describing", such that the receiver doesn't need to know anything about the data other than the bytes received. This allows the receiver to correctly parse a struct and print its member variables' names and values, even if variables are added, removed, reordered, or renamed. The telemetry function works for non-nested structs and will print the name and type of a nested struct. But I need it to recursively print the names, types, sizes, and values of the nested struct's member variables. An example is shown below, as is the code.
Current behavior
use derive_telemetry::Telemetry;
use std::fs::File;
use std::io::{Write};
use std::time::{Duration, Instant};
use std::marker::Send;
use std::sync::{Arc, Mutex};
use serde::{Deserialize, Serialize};
#[repr(C, packed)]
#[derive(Debug, Serialize, Deserialize, Copy, Clone)]
pub struct AnotherCustomStruct {
pub my_var_2: f64,
pub my_var_1: f32,
}
#[derive(Telemetry)]
#[derive(Debug, Serialize, Deserialize)]
struct TestStruct {
pub a: u32,
pub b: u32,
pub my_custom_struct: AnotherCustomStruct,
pub my_array: [u32; 10],
pub my_vec: Vec::<u64>,
pub another_variable : String,
}
const HEADER_FILENAME: &str = "test_file_stream.header";
const DATA_FILENAME: &str = "test_file_stream.data";
fn main() -> Result<(), Box<dyn std::error::Error>>{
let my_struct = TestStruct { a: 10,
b: 11,
my_custom_struct: AnotherCustomStruct { my_var_1: 123.456, my_var_2: 789.1023 },
my_array: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
my_vec: vec![11, 12, 13],
another_variable: "Hello".to_string()
};
let file_header_stream = Mutex::new(Box::new(File::create(HEADER_FILENAME)?) as Box <dyn std::io::Write + Send + Sync>);
let file_data_stream = Mutex::new(Box::new(File::create(DATA_FILENAME)?) as Box <dyn std::io::Write + Send + Sync>);
my_struct.telemetry(Arc::new(file_header_stream), Arc::new(file_data_stream));
let header: TelemetryHeader = bincode::deserialize_from(&File::open(HEADER_FILENAME)?)?;
let data: TestStruct = bincode::deserialize_from(&File::open(DATA_FILENAME)?)?;
println!("{:#?}", header);
println!("{:?}", data);
Ok(())
}
produces
TelemetryHeader {
variable_descriptions: [
VariableDescription {
var_name_length: 1,
var_name: "a",
var_type_length: 3,
var_type: "u32",
var_size: 4,
},
VariableDescription {
var_name_length: 1,
var_name: "b",
var_type_length: 3,
var_type: "u32",
var_size: 4,
},
VariableDescription {
var_name_length: 16,
var_name: "my_custom_struct",
var_type_length: 19,
var_type: "AnotherCustomStruct",
var_size: 12,
},
VariableDescription {
var_name_length: 8,
var_name: "my_array",
var_type_length: 10,
var_type: "[u32 ; 10]",
var_size: 40,
},
VariableDescription {
var_name_length: 6,
var_name: "my_vec",
var_type_length: 14,
var_type: "Vec :: < u64 >",
var_size: 24,
},
VariableDescription {
var_name_length: 16,
var_name: "another_variable",
var_type_length: 6,
var_type: "String",
var_size: 24,
},
],
}
TestStruct { a: 10, b: 11, my_custom_struct: AnotherCustomStruct { my_var_2: 789.1023, my_var_1: 123.456 }, my_array: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], my_vec: [11, 12, 13], another_variable: "Hello" }
The data format is length of variable name, variable name, length of variable type, variable type, variable num of bytes.
Required Behavior
TelemetryHeader {
variable_descriptions: [
VariableDescription {
var_name_length: 1,
var_name: "a",
var_type_length: 3,
var_type: "u32",
var_size: 4,
},
VariableDescription {
var_name_length: 1,
var_name: "b",
var_type_length: 3,
var_type: "u32",
var_size: 4,
},
VariableDescription {
var_name_length: 16,
var_name: "my_custom_struct",
var_type_length: 19,
var_type: "AnotherCustomStruct",
var_size: 12,
},
VariableDescription {
var_name_length: 8,
var_name: "my_var_2",
var_type_length: 3,
var_type: "f64",
var_size: 8,
},
VariableDescription {
var_name_length: 8,
var_name: "my_var_1",
var_type_length: 3,
var_type: "f32",
var_size: 4,
},
VariableDescription {
var_name_length: 8,
var_name: "my_array",
var_type_length: 10,
var_type: "[u32 ; 10]",
var_size: 40,
},
VariableDescription {
var_name_length: 6,
var_name: "my_vec",
var_type_length: 14,
var_type: "Vec :: < u64 >",
var_size: 24,
},
VariableDescription {
var_name_length: 16,
var_name: "another_variable",
var_type_length: 6,
var_type: "String",
var_size: 24,
},
],
}
TestStruct { a: 10, b: 11, my_custom_struct: AnotherCustomStruct { my_var_2: 789.1023, my_var_1: 123.456 }, my_array: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], my_vec: [11, 12, 13], another_variable: "Hello" }
To reiterate, the current behavior correctly prints the variable name, type, size, and value for a struct that derives Telemetry. It prints the name, type, and size of a nested struct correctly, but it does not then print the names and values of the nested struct's members, which is required. The code is below. Apologies for such a long post; I hope it is well formatted and clear. Thank you in advance.
Directory Structure
src
-main.rs
telemetry
-Cargo.toml
-src
--lib.rs
Cargo.toml
main.rs
use derive_telemetry::Telemetry;
use std::fs::File;
use std::io::{Write};
use std::time::{Duration, Instant};
use std::marker::Send;
use std::sync::{Arc, Mutex};
use serde::{Deserialize, Serialize};
const HEADER_FILENAME: &str = "test_file_stream.header";
const DATA_FILENAME: &str = "test_file_stream.data";
#[repr(C, packed)]
#[derive(Debug, Serialize, Deserialize, Copy, Clone)]
pub struct AnotherCustomStruct {
pub my_var_2: f64,
pub my_var_1: f32,
}
#[derive(Telemetry)]
#[derive(Debug, Serialize, Deserialize)]
struct TestStruct {
pub a: u32,
pub b: u32,
pub my_custom_struct: AnotherCustomStruct,
pub my_array: [u32; 10],
pub my_vec: Vec::<u64>,
pub another_variable : String,
}
fn main() -> Result<(), Box<dyn std::error::Error>>{
let my_struct = TestStruct { a: 10,
b: 11,
my_custom_struct: AnotherCustomStruct { my_var_1: 123.456, my_var_2: 789.1023 },
my_array: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
my_vec: vec![11, 12, 13],
another_variable: "Hello".to_string()
};
let file_header_stream = Mutex::new(Box::new(File::create(HEADER_FILENAME)?) as Box <dyn std::io::Write + Send + Sync>);
let file_data_stream = Mutex::new(Box::new(File::create(DATA_FILENAME)?) as Box <dyn std::io::Write + Send + Sync>);
//let stdout_header_stream = Mutex::new(Box::new(io::stdout()) as Box <dyn std::io::Write + Send + Sync>);
//let stdout_data_stream = Mutex::new(Box::new(io::stdout()) as Box <dyn std::io::Write + Send + Sync>);
//let tcp_header_stream = Mutex::new(Box::new(TCPStream::connect(127.0.0.1)?) as Box <dyn std::io::Write + Send + Sync>);
//let tcp_data_stream = Mutex::new(Box::new(TCPStream::connect(127.0.0.1)?) as Box <dyn std::io::Write + Send + Sync>);
//let test_traits = Mutex::new(Box::new(io::stdout()) as Box <dyn std::io::Write + Send + Sync>);
let start = Instant::now();
my_struct.telemetry(Arc::new(file_header_stream), Arc::new(file_data_stream));
let duration = start.elapsed();
println!("Telemetry took: {:?}", duration);
thread::sleep(Duration::from_secs(1));
let header: TelemetryHeader = bincode::deserialize_from(&File::open(HEADER_FILENAME)?)?;
let data: TestStruct = bincode::deserialize_from(&File::open(DATA_FILENAME)?)?;
println!("{:#?}", header);
println!("{:?}", data);
Ok(())
}
main Cargo.toml
[package]
name = "proc_macro_test"
version = "0.1.0"
edition = "2018"
[workspace]
members = [
"telemetry",
]
[dependencies]
derive_telemetry = { path = "telemetry" }
ndarray = "0.15.3"
crossbeam = "*"
serde = { version = "*", features=["derive"]}
bincode = "*"
[profile.dev]
opt-level = 0
[profile.release]
opt-level = 3
telemetry lib.rs
use proc_macro::TokenStream;
use quote::quote;
use syn::{parse_macro_input, parse_quote, DeriveInput};
#[proc_macro_derive(Telemetry)]
pub fn derive(input: TokenStream) -> TokenStream {
let input = parse_macro_input!(input as DeriveInput);
let output = parse_derive_input(&input);
match output {
syn::Result::Ok(tt) => tt,
syn::Result::Err(err) => err.to_compile_error(),
}
.into()
}
fn parse_derive_input(input: &syn::DeriveInput) -> syn::Result<proc_macro2::TokenStream> {
let struct_ident = &input.ident;
let struct_data = parse_data(&input.data)?;
let struct_fields = &struct_data.fields;
let generics = add_debug_bound(struct_fields, input.generics.clone());
let (impl_generics, ty_generics, where_clause) = generics.split_for_impl();
let _struct_ident_str = format!("{}", struct_ident);
let tele_body = match struct_fields {
syn::Fields::Named(fields_named) => handle_named_fields(fields_named)?,
syn::Fields::Unnamed(fields_unnamed) => {
let field_indexes = (0..fields_unnamed.unnamed.len()).map(syn::Index::from);
let field_indexes_str = (0..fields_unnamed.unnamed.len()).map(|idx| format!("{}", idx));
quote!(#( .field(#field_indexes_str, &self.#field_indexes) )*)
}
syn::Fields::Unit => quote!(),
};
let telemetry_declaration = quote!(
trait Telemetry {
fn telemetry(self, header_stream: Arc<Mutex::<Box <std::io::Write + std::marker::Send + Sync>>>, data_stream: Arc<Mutex::<Box <std::io::Write + std::marker::Send + Sync>>>);
}
);
syn::Result::Ok(
quote!(
use std::thread;
use std::collections::VecDeque;
#[derive(Serialize, Deserialize, Default, Debug)]
pub struct VariableDescription {
pub var_name_length: usize,
pub var_name: String,
pub var_type_length: usize,
pub var_type: String,
pub var_size: usize,
}
#[derive(Serialize, Deserialize, Default, Debug)]
pub struct TelemetryHeader {
pub variable_descriptions: VecDeque::<VariableDescription>,
}
#telemetry_declaration
impl #impl_generics Telemetry for #struct_ident #ty_generics #where_clause {
fn telemetry(self, header_stream: Arc<Mutex::<Box <std::io::Write + std::marker::Send + Sync>>>, data_stream: Arc<Mutex::<Box <std::io::Write + std::marker::Send + Sync>>>) {
thread::spawn(move || {
#tele_body;
});
}
}
)
)
}
fn handle_named_fields(fields: &syn::FieldsNamed) -> syn::Result<proc_macro2::TokenStream> {
let idents = fields.named.iter().map(|f| &f.ident);
let types = fields.named.iter().map(|f| &f.ty);
let num_entities = fields.named.len();
let test = quote! (
let mut tele_header = TelemetryHeader {variable_descriptions: VecDeque::with_capacity(#num_entities)};
#(
tele_header.variable_descriptions.push_back( VariableDescription {
var_name_length: stringify!(#idents).len(),
var_name: stringify!(#idents).to_string(),
var_type_length: stringify!(#types).len(),
var_type: stringify!(#types).to_string(),
var_size: std::mem::size_of_val(&self.#idents),
});
)*
header_stream.lock().unwrap().write(&bincode::serialize(&tele_header).unwrap()).unwrap();
data_stream.lock().unwrap().write(&bincode::serialize(&self).unwrap()).unwrap();
);
syn::Result::Ok(test)
}
fn parse_named_field(field: &syn::Field) -> proc_macro2::TokenStream {
let ident = field.ident.as_ref().unwrap();
let ident_str = format!("{}", ident);
let ident_type = &field.ty;
if field.attrs.is_empty() {
quote!(
println!("Var Name Length: {}", stringify!(#ident_str).len());
println!("Var Name: {}", #ident_str);
println!("Var Type Length: {}", stringify!(#ident_type).len());
println!("Var Type: {}", stringify!(#ident_type));
println!("Var Val: {}", &self.#ident);
)
}
else {
//parse_named_field_attrs(field)
quote!()
}
}
fn parse_named_field_attrs(field: &syn::Field) -> syn::Result<proc_macro2::TokenStream> {
let ident = field.ident.as_ref().unwrap();
let ident_str = format!("{}", ident);
let attr = field.attrs.last().unwrap();
if !attr.path.is_ident("debug") {
return syn::Result::Err(syn::Error::new_spanned(
&attr.path,
"value must be \"debug\"",
));
}
let attr_meta = &attr.parse_meta();
match attr_meta {
Ok(syn::Meta::NameValue(syn::MetaNameValue { lit, .. })) => {
let debug_assign_value = lit;
syn::Result::Ok(quote!(
.field(#ident_str, &std::format_args!(#debug_assign_value, &self.#ident))
))
}
Ok(meta) => syn::Result::Err(syn::Error::new_spanned(meta, "expected meta name value")),
Err(err) => syn::Result::Err(err.clone()),
}
}
fn parse_data(data: &syn::Data) -> syn::Result<&syn::DataStruct> {
match data {
syn::Data::Struct(data_struct) => syn::Result::Ok(data_struct),
syn::Data::Enum(syn::DataEnum { enum_token, .. }) => syn::Result::Err(
syn::Error::new_spanned(enum_token, "CustomDebug is not implemented for enums"),
),
syn::Data::Union(syn::DataUnion { union_token, .. }) => syn::Result::Err(
syn::Error::new_spanned(union_token, "CustomDebug is not implemented for unions"),
),
}
}
fn add_debug_bound(fields: &syn::Fields, mut generics: syn::Generics) -> syn::Generics {
let mut phantom_ty_idents = std::collections::HashSet::new();
let mut non_phantom_ty_idents = std::collections::HashSet::new();
let g = generics.clone();
for (ident, opt_iter) in fields
.iter()
.flat_map(extract_ty_path)
.map(|path| extract_ty_idents(path, g.params.iter().flat_map(|p| {
if let syn::GenericParam::Type(ty) = p {
std::option::Option::Some(&ty.ident)
} else {
std::option::Option::None
}
} ).collect()))
{
if ident == "PhantomData" {
// If the field type ident is `PhantomData`
// add the generic parameters into the phantom idents collection
if let std::option::Option::Some(args) = opt_iter {
for arg in args {
phantom_ty_idents.insert(arg);
}
}
} else {
// Else, add the type and existing generic parameters into the non-phantom idents collection
non_phantom_ty_idents.insert(ident);
if let std::option::Option::Some(args) = opt_iter {
for arg in args {
non_phantom_ty_idents.insert(arg);
}
}
}
}
// Find the difference between the phantom idents and non-phantom idents
// Collect them into an hash set for O(1) lookup
let non_debug_fields = phantom_ty_idents
.difference(&non_phantom_ty_idents)
.collect::<std::collections::HashSet<_>>();
// Iterate generic params and if their ident is NOT in the phantom fields
// do not add the generic bound
for param in generics.type_params_mut() {
// this is kinda shady, hoping it works
if !non_debug_fields.contains(&&param.ident) {
param.bounds.push(parse_quote!(std::fmt::Debug));
}
}
generics
}
/// Extract the path from the type path in a field.
fn extract_ty_path(field: &syn::Field) -> std::option::Option<&syn::Path> {
if let syn::Type::Path(syn::TypePath { path, .. }) = &field.ty {
std::option::Option::Some(&path)
} else {
std::option::Option::None
}
}
/// From a `syn::Path` extract both the type ident and an iterator over generic type arguments.
fn extract_ty_idents<'a>(
path: &'a syn::Path,
params: std::collections::HashSet<&'a syn::Ident>,
) -> (
&'a syn::Ident,
std::option::Option<impl Iterator<Item = &'a syn::Ident>>,
) {
let ty_segment = path.segments.last().unwrap();
let ty_ident = &ty_segment.ident;
if let syn::PathArguments::AngleBracketed(syn::AngleBracketedGenericArguments {
args, ..
}) = &ty_segment.arguments
{
let ident_iter = args.iter().flat_map(move |gen_arg| {
if let syn::GenericArgument::Type(syn::Type::Path(syn::TypePath { path, .. })) = gen_arg
{
match path.segments.len() {
2 => {
let ty = path.segments.first().unwrap();
let assoc_ty = path.segments.last().unwrap();
if params.contains(&ty.ident) {
std::option::Option::Some(&assoc_ty.ident)
} else {
std::option::Option::None
}
}
1 => path.get_ident(),
_ => std::unimplemented!("kinda tired of edge cases"),
}
} else {
std::option::Option::None
}
});
(ty_ident, std::option::Option::Some(ident_iter))
} else {
(ty_ident, std::option::Option::None)
}
}
#[cfg(test)]
mod tests {
#[test]
fn it_works() {
assert_eq!(2 + 2, 4);
}
}
telemetry Cargo.toml
[package]
name = "derive_telemetry"
version = "0.0.0"
edition = "2018"
autotests = false
publish = false
[lib]
proc-macro = true
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
proc-macro2 = ">= 1.0.29"
syn = ">= 1.0.76"
quote = ">= 1.0.9"
crossbeam = "*"
serde = { version = "*", features=["derive"]}
bincode = "*"
Once again, apologies for the lengthy post. I hope this is clear. I believe this is everything needed to reproduce what I have, and unfortunately I believe it is already a minimal working example; otherwise I would have taken more out for ease of reading and answering.
I think if I could call handle_named_fields recursively on any Field that is a struct I would get the desired behavior. The only problem with this approach is that I don't see an obvious way to tell if a Field is a struct or not. If I had a syn::Data it would be trivial, but I can't see how to do this with a syn::Field.
What you're trying is maybe a bit complicated for a SO question. I can only give an answer sketch (which would usually be a comment, but it's too long for that).
You can't parse "nested" structs in one macro run. Your derive macro gets access to the one struct it's running on, that's it. The only thing you can do is have the generated code "recursively" call other generated code (i.e. TestStruct::telemetry could call AnotherCustomStruct::telemetry).
There is one fundamental problem, though: Your macro won't know which struct members to generate a recursive call for. To solve that, you can either do the recursive call on all members, and implement Telemetry for a buttload of existing types, or you ask the user to add some #[recursive_telemetry] attribute to struct members they want included, and fire only on that.
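To make the first option concrete, here is a hand-written sketch of what the generated code could look like if it calls a trait method on every member. All names (Describe, FieldDesc, Inner, Outer) are hypothetical and not part of the question's crate; the two struct impls stand in for what the derive macro would emit.

```rust
/// Hypothetical, trimmed-down stand-in for the question's VariableDescription.
#[derive(Debug, PartialEq)]
struct FieldDesc {
    name: String,
    ty: &'static str,
    size: usize,
}

/// Hypothetical trait the derive macro would implement for each struct.
trait Describe {
    /// Append a description of `self`, stored under the field name `name`.
    fn describe(&self, name: &str, out: &mut Vec<FieldDesc>);
}

// Leaf types get one blanket-style impl each, written once by the library author.
macro_rules! impl_leaf {
    ($($t:ty),*) => {$(
        impl Describe for $t {
            fn describe(&self, name: &str, out: &mut Vec<FieldDesc>) {
                out.push(FieldDesc {
                    name: name.to_string(),
                    ty: stringify!($t),
                    size: std::mem::size_of::<$t>(),
                });
            }
        }
    )*};
}
impl_leaf!(u32, f32, f64);

struct Inner { my_var_2: f64, my_var_1: f32 }
struct Outer { a: u32, inner: Inner }

// What a derive could emit: one entry for the struct itself, then one call per field.
// The macro never needs to know whether a field is a struct; the trait dispatch decides.
impl Describe for Inner {
    fn describe(&self, name: &str, out: &mut Vec<FieldDesc>) {
        out.push(FieldDesc { name: name.to_string(), ty: "Inner", size: std::mem::size_of::<Inner>() });
        self.my_var_2.describe("my_var_2", out);
        self.my_var_1.describe("my_var_1", out);
    }
}

impl Describe for Outer {
    fn describe(&self, name: &str, out: &mut Vec<FieldDesc>) {
        out.push(FieldDesc { name: name.to_string(), ty: "Outer", size: std::mem::size_of::<Outer>() });
        self.a.describe("a", out);
        self.inner.describe("inner", out);
    }
}

fn main() {
    let value = Outer { a: 10, inner: Inner { my_var_2: 789.1023, my_var_1: 123.456 } };
    let mut out = Vec::new();
    value.describe("root", &mut out);
    for desc in &out {
        println!("{:?}", desc);
    }
}
```

The nested struct's members show up right after the entry for the nested struct itself, which matches the ordering in the required output. A field whose type has no Describe impl becomes a compile error, which is where the attribute-based alternative would come in.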

How to pad an array with zeros?

fn main() {
let arr: [u8;8] = [97, 112, 112, 108, 101];
println!("Len is {}",arr.len());
println!("Elements are {:?}",arr);
}
error[E0308]: mismatched types
--> src/main.rs:2:23
|
2 | let arr: [u8;8] = [97, 112, 112, 108, 101];
| ------ ^^^^^^^^^^^^^^^^^^^^^^^^ expected an array with a fixed size of 8 elements, found one with 5 elements
| |
| expected due to this
Is there any way to pad the remaining elements with 0's? Something like:
let arr: [u8;8] = [97, 112, 112, 108, 101].something();
In addition to the other answers, you can use const generics to write a dedicated method.
fn pad_zeroes<const A: usize, const B: usize>(arr: [u8; A]) -> [u8; B] {
assert!(B >= A); //just for a nicer error message, adding #[track_caller] to the function may also be desirable
let mut b = [0; B];
b[..A].copy_from_slice(&arr);
b
}
Playground
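A short usage sketch for the function above (repeated here so the snippet compiles on its own). The input length A is inferred from the argument and the output length B from the type annotation, so the call site stays close to the .something() form asked for in the question.

```rust
fn pad_zeroes<const A: usize, const B: usize>(arr: [u8; A]) -> [u8; B] {
    assert!(B >= A); // panics at runtime if the target is too small
    let mut b = [0; B];
    b[..A].copy_from_slice(&arr);
    b
}

fn main() {
    // A = 5 comes from the argument, B = 8 from the annotation.
    let arr: [u8; 8] = pad_zeroes([97, 112, 112, 108, 101]);
    println!("Len is {}", arr.len());
    println!("Elements are {:?}", arr); // [97, 112, 112, 108, 101, 0, 0, 0]

    // Both lengths can also be spelled out with a turbofish.
    let wide = pad_zeroes::<2, 4>([1, 2]);
    println!("{:?}", wide); // [1, 2, 0, 0]
}
```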
You can use concat_arrays macro for it:
use concat_arrays::concat_arrays;
fn main() {
let arr: [u8; 8] = concat_arrays!([97, 112, 112, 108, 101], [0; 3]);
println!("{:?}", arr);
}
I don't think it's possible to do without external dependencies.
You could start with zeros and set the initial values afterwards. This requires arr to be mut though.
fn main() {
let mut arr: [u8;8] = [0;8];
let init = [97, 112, 112, 108, 101];
arr[..init.len()].copy_from_slice(&init);
println!("Len is {}",arr.len());
println!("Elements are {:?}",arr);
}
Link to Playground

What is the Rust equivalent of a JavaScript object when encoding with msgpack?

I'm trying to port a JavaScript library which uses msgpack for encoding JavaScript objects to Rust. I found a Rust library for msgpack encoding/decoding, but I don't get what is the equivalent input format in Rust.
This JavaScript code for encoding the object {"a": 3, "b": 5}
gives the output 82 a1 61 03 a1 62 05:
const msgpack = require("msgpack-lite");
const obj = { a: 3, b: 5 };
msgpack.encode(obj);
I tried representing the object as a Rust struct and encoding it using rmp-serde library
use rmp_serde::{Deserializer, Serializer};
use serde::{Deserialize, Serialize};
#[derive(Debug, Serialize, Deserialize)]
pub struct Test {
a: u32,
b: u32,
}
fn main() {
let mut buf = Vec::new();
let val = Test { a: 3, b: 5 };
val.serialize(&mut Serializer::new(&mut buf)).unwrap();
println!("{:?}", buf);
}
I get the output [146, 3, 5]. How do I represent JSON input in Rust?
What is the Rust equivalent of a JavaScript object
That is a HashMap:
use rmp_serde::{Deserializer, Serializer, encode::StructMapWriter};
use serde::{Deserialize, Serialize};
use std::collections::HashMap;
#[derive(Debug, Serialize, Deserialize)]
pub struct Test {
a: u32,
b: u32,
}
fn main() {
let mut buf = Vec::new();
let mut val = HashMap::new();
val.insert("a", 3);
val.insert("b", 5);
val.serialize(&mut Serializer::new(&mut buf)).unwrap();
println!("{:x?}", buf);
let test: Test = Deserialize::deserialize(&mut Deserializer::new(&buf[..])).unwrap();
println!("{:?}", test);
buf.clear();
test.serialize(&mut Serializer::with(&mut buf, StructMapWriter))
.unwrap();
println!("{:x?}", buf);
}
This gives the expected output:
[82, a1, 61, 3, a1, 62, 5]
Test { a: 3, b: 5 }
[82, a1, 61, 3, a1, 62, 5]
As you can see, you can deserialize into something other than a HashMap, but serialization will not produce the same thing because you "lost" the information that it was a HashMap. The default of rmp is to use compact serialization ("This is the default constructor, which returns a serializer that will serialize structs using compact tuple representation, without field names."), but you can tell rmp to serialize it differently with StructMapWriter if you need to.
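To see why the two outputs differ, it may help to build both byte sequences by hand. This is a std-only sketch that covers only the small "fix" formats of MessagePack (fixmap, fixarray, fixstr, positive fixint); the function names are invented for illustration and are not part of rmp.

```rust
/// Encode up to 15 (key, value) pairs as a MessagePack fixmap.
/// Keys must be shorter than 32 bytes and values at most 127.
fn encode_fixmap(entries: &[(&str, u8)]) -> Vec<u8> {
    assert!(entries.len() < 16);
    // 0x80 | n: fixmap header with n entries.
    let mut buf = vec![0x80 | entries.len() as u8];
    for (key, value) in entries {
        assert!(key.len() < 32 && *value < 128);
        // 0xa0 | len: fixstr header, followed by the UTF-8 bytes of the key.
        buf.push(0xa0 | key.len() as u8);
        buf.extend_from_slice(key.as_bytes());
        // Values 0..=127 are stored as a single positive-fixint byte.
        buf.push(*value);
    }
    buf
}

/// Encode up to 15 small integers as a MessagePack fixarray.
fn encode_fixarray(values: &[u8]) -> Vec<u8> {
    assert!(values.len() < 16);
    // 0x90 | n: fixarray header with n elements.
    let mut buf = vec![0x90 | values.len() as u8];
    for &v in values {
        assert!(v < 128);
        buf.push(v);
    }
    buf
}

fn main() {
    // Map form: what msgpack-lite produces for {"a": 3, "b": 5}.
    let map = encode_fixmap(&[("a", 3), ("b", 5)]);
    assert_eq!(map, [0x82, 0xa1, 0x61, 0x03, 0xa1, 0x62, 0x05]);

    // Compact form: field names dropped, only the values in order.
    let array = encode_fixarray(&[3, 5]);
    assert_eq!(array, [146, 3, 5]);

    println!("{:02x?}\n{:02x?}", map, array);
}
```

The [146, 3, 5] from the question is therefore a 2-element array (0x92 = 146), while the JavaScript output is a 2-entry map with the keys written out.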

Parsing variable from delimited file

I have a file named important.txt whose content is delimited by the pipe symbol |:
1|130|80|120|110|E
2|290|420|90|70|B
3|100|220|30|80|C
Then, I use Rust BufReader::split to read its content.
use std::fs::File;
use std::io::prelude::*;
use std::io::BufReader;
use std::path::Path;
fn main() {
let path = Path::new("important.txt");
let display = path.display();
// Open read-only
let file = match File::open(&path) {
Err(why) => panic!("can't open {}: {}", display, why),
Ok(file) => file,
};
// Split the content on '|'
let reader = BufReader::new(&file);
for vars in reader.split(b'|') {
println!("{:?}\n", vars.unwrap());
}
}
The problem is, vars.unwrap() would return bytes instead of string.
[49]
[49, 51, 48]
[56, 48]
[49, 50, 48]
[49, 49, 48]
[69, 10, 50]
[50, 57, 48]
[52, 50, 48]
[57, 48]
[55, 48]
[66, 10, 51]
[49, 48, 48]
[50, 50, 48]
[51, 48]
[56, 48]
[67, 10]
Do you have any ideas how to parse this delimited file into variables in Rust?
Since your data is line-based, you can use BufRead::lines:
use std::io::{BufReader, BufRead};
fn main() {
let input = r#"1|130|80|120|110|E
2|290|420|90|70|B
3|100|220|30|80|C
"#;
let reader = BufReader::new(input.as_bytes());
for line in reader.lines() {
for value in line.unwrap().split('|') {
println!("{}", value);
}
}
}
This gives you an iterator over Strings for each line in the input. You then use str::split to get the pieces.
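Alternatively, you can take the bytes you already have (a Vec<u8> per piece) and make a string from them with str::from_utf8. Note that splitting only on | leaves the line breaks inside the pieces (for example "E\n2"), so you would still need to handle those: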
use std::io::{BufReader, BufRead};
use std::str;
fn main() {
let input = r#"1|130|80|120|110|E
2|290|420|90|70|B
3|100|220|30|80|C
"#;
let reader = BufReader::new(input.as_bytes());
for vars in reader.split(b'|') {
let bytes = vars.unwrap();
let s = str::from_utf8(&bytes).unwrap();
println!("{}", s);
}
}
You may also want to look into the csv crate if you are reading structured data like a CSV that just happens to be pipe-delimited.
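If the goal is typed variables and not just strings, each line can be parsed into a struct. This is a sketch with only the standard library; the Record type and its field names are guesses, since the question does not say what the columns mean.

```rust
use std::io::{BufRead, BufReader};

/// Hypothetical record layout: an id, four numbers, and a one-letter code.
#[derive(Debug, PartialEq)]
struct Record {
    id: u32,
    values: [u32; 4],
    grade: char,
}

/// Pull the next '|'-separated field, or report that the line is too short.
fn field<'a>(parts: &mut impl Iterator<Item = &'a str>) -> Result<&'a str, String> {
    parts.next().ok_or_else(|| "missing field".to_string())
}

/// Parse one line such as "1|130|80|120|110|E".
fn parse_line(line: &str) -> Result<Record, String> {
    let mut parts = line.split('|');
    let id = field(&mut parts)?.parse::<u32>().map_err(|e| e.to_string())?;
    let mut values = [0u32; 4];
    for v in values.iter_mut() {
        *v = field(&mut parts)?.parse::<u32>().map_err(|e| e.to_string())?;
    }
    let grade = field(&mut parts)?
        .chars()
        .next()
        .ok_or_else(|| "empty grade".to_string())?;
    Ok(Record { id, values, grade })
}

fn main() {
    let input = "1|130|80|120|110|E\n2|290|420|90|70|B\n3|100|220|30|80|C\n";
    let reader = BufReader::new(input.as_bytes());
    for line in reader.lines() {
        let record = parse_line(&line.unwrap()).unwrap();
        println!("{:?}", record);
    }
}
```

Returning a Result from parse_line keeps malformed lines from panicking deep inside the loop; the caller decides whether to skip them or stop.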

Resources