I'm attempting to use hcl-rs = 0.7.0 to parse some HCL. I'm just experimenting with arbitrary HCL, so I'm not looking to parse terraform specific code.
I'd like to be able to parse a block like this and get it's label as part of result
nested_block "nested_block_label" {
foo = 123
}
This currently doesn't work, but hopefully it shows my intention. Is something like this possible?
#[test]
fn deserialize_struct_with_label() {
#[derive(Deserialize, PartialEq, Debug)]
struct TestRoot {
nested_block: TestNested,
}
#[derive(Deserialize, PartialEq, Debug)]
struct TestNested {
label: String,
foo: u32,
}
let input = r#"
nested_block "nested_block_label" {
foo = 123
}"#;
let expected = TestRoot{ nested_block: TestNested { label: String::from("nested_block_label"), foo: 123 } };
assert_eq!(expected, from_str::<TestRoot>(input).unwrap());
}
Your problem is that hcl by default seems to interpret
nested_block "nested_block_label" {
foo = 123
}
as the following "serde structure":
"nested_block" -> {
"nested_block_label" -> {
"foo" -> 123
}
}
but for your Rust structs, it would have to be
"nested_block" -> {
"label" -> "nested_block_label"
"foo" -> 123
}
I'm not aware of any attributes that would allow you to bend the former into the latter.
As usual, when faced with this kind of situation, it is often easiest to first deserialize to a generic structure like hcl::Block and then convert to whatever struct you want manually. Disadvantage is that you'd have to do that for every struct separately.
You could, in theory, implement a generic deserialization function that wraps the Deserializer it receives and flattens the two-level structure you get into the one-level structure you want. But implementing Deserializers requires massive boilerplate. Possibly, somebody has done that before, but I'm not aware of any crate that would help you here, either.
As a sort of medium effort solution, you could have a special Labelled wrapper structure that always catches and flattens this intermediate level:
#[derive(Debug, PartialEq)]
struct Labelled<T> {
label: String,
t: T,
}
impl<'de, T: Deserialize<'de>> Deserialize<'de> for Labelled<T> {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where
D: serde::Deserializer<'de>,
{
struct V<T>(std::marker::PhantomData<T>);
impl<'de, T: Deserialize<'de>> serde::de::Visitor<'de> for V<T> {
type Value = Labelled<T>;
fn visit_map<A>(self, mut map: A) -> Result<Self::Value, A::Error>
where
A: serde::de::MapAccess<'de>,
{
if let (Some((label, t)), None) =
(map.next_entry()?, map.next_entry::<String, ()>()?)
{
Ok(Labelled { label, t })
} else {
Err(serde::de::Error::invalid_type(
serde::de::Unexpected::Other("Singleton map"),
&self,
))
}
}
}
deserializer.deserialize_map(V::<T>(Default::default()))
}
}
would be used like this:
#[derive(Deserialize, PartialEq, Debug)]
struct TestRoot {
nested_block: Labelled<TestNested>,
}
#[derive(Deserialize, PartialEq, Debug)]
struct TestNested {
foo: u32,
}
Lastly, there may be some trick like adding #[serde(rename = "$hcl::label")]. Other serialization libraries (e.g. quick-xml) have similar and allow marking fields as something special this way. hcl-rs does the same internally, but it's undocumented and I can't figure out from the source whether what you need is possible.
Related
How does one add a limit over length of a String (Vec or Array) when deserializing using serde ?
Best guess is:
use serde::Deserialize;
use arrayvec::ArrayString;
#[derive(Deserialize)]
pub struct Foo {
#[serde(try_from = "ArrayString<4096>")] // FAILS!
bar: String,
}
Which does not work because try_from attribute is only available for "containers" and not "fields".
If there is a way to make this solution work, the main concerns are efficiency and security. Is this solution fast (not dramatically slowing down the deserialization process for short String) and safe (does it prevents the memory from being used up by a malicious request), and is it idiomatic ?
As pointed out by #cafce25, by the point your Deserialize impl gets the string, it is already fully parsed from the input stream. This is just how serde works, and thus this restriction will not provide any protection.
You must implement this kind of restriction at the request level, or with your own parser.
If you still wish to implement a restriction like this, it's pretty simple to do with a newtype that delegates to String::deserialize:
use std::ops::Deref;
use serde::de;
use serde::Deserialize;
struct LimitedString<const MAX_LENGTH: usize>(String);
impl<const MAX_LENGTH: usize> Deref for LimitedString<MAX_LENGTH> {
type Target = String;
fn deref(&self) -> &Self::Target {
&self.0
}
}
impl<'de, const MAX_LENGTH: usize> de::Deserialize<'de> for LimitedString<MAX_LENGTH> {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where
D: de::Deserializer<'de>,
{
<String as de::Deserialize>::deserialize(deserializer).and_then(|inner| {
if inner.len() > MAX_LENGTH {
Err(de::Error::invalid_length(inner.len(), &"an integer lower than the maximum"))
} else {
Ok(Self(inner))
}
})
}
}
#[derive(Deserialize)]
pub struct Foo {
bar: LimitedString<4096>,
}
playground
I am trying to understand following enum from this repo
#[repr(C)]
#[derive(BorshSerialize, BorshDeserialize, Debug, Clone)]
pub struct InitEscrowArgs {
pub data: EscrowReceive,
}
#[repr(C)]
#[derive(BorshSerialize, BorshDeserialize, Debug, Clone)]
pub struct ExchangeArgs {
pub data: EscrowReceive,
}
#[derive(BorshSerialize, BorshDeserialize, Clone)]
pub enum EscrowInstruction {
InitEscrow(InitEscrowArgs),
Exchange(ExchangeArgs),
CancelEscrow(),
}
and it's use of it in this match from this repo.
pub fn process(
program_id: &Pubkey,
accounts: &[AccountInfo],
instruction_data: &[u8],
) -> ProgramResult {
let instruction = EscrowInstruction::try_from_slice(instruction_data)?;
match instruction {
EscrowInstruction::InitEscrow(args) => {
msg!("Instruction: Init Escrow");
Self::process_init_escrow(program_id, accounts, args.data.amount)
}
EscrowInstruction::Exchange(args) => {
msg!("Instruction: Exchange Escrow");
Self::process_exchange(program_id, accounts, args.data.amount)
}
EscrowInstruction::CancelEscrow() => {
msg!("Instruction: Cancel Escrow");
Self::process_cancel(program_id, accounts)
}
}
}
I understand that this try_from_slice method gets some sort of byte array and deserialize it.
I do not understand how it determines which enum value to use.
The enum has 3 choices, InitEscrow / Exchange / CancelEscrow, but what determines the match to know which one it is suppose to select?
Seem to me the InitEscrowArgs and ExchangeArgs both takes in same struct. Both containing data that is EscrowReceive data type.
Method try_from_slice is part of the BorshDeserialize trait, which is derived on the enum in question. So, the choice between enum variants is made by the implementation of deserializer.
To see what is really going on, I've built the simplest possible example:
use borsh::BorshDeserialize;
#[derive(BorshDeserialize)]
enum Enum {
Variant1(u8),
Variant2,
}
By using cargo expand and a little manual cleanup, we can get the following equivalent code:
impl borsh::de::BorshDeserialize for Enum {
fn deserialize(buf: &mut &[u8]) -> Result<Self, std::io::Error> {
let variant_idx: u8 = borsh::BorshDeserialize::deserialize(buf)?;
let return_value = match variant_idx {
0u8 => Enum::Variant1(borsh::BorshDeserialize::deserialize(buf)?),
1u8 => Enum::Variant2,
_ => {
let msg = format!("Unexpected variant index: {}", variant_idx);
return Err(std::io::Error::new(
std::io::ErrorKind::InvalidInput,
msg,
));
}
};
Ok(return_value)
}
}
Where the inner deserialize calls refers to impl BorshDeserialize for u8:
fn deserialize(buf: &mut &[u8]) -> Result<Self> {
if buf.is_empty() {
return Err(Error::new(
ErrorKind::InvalidInput,
ERROR_UNEXPECTED_LENGTH_OF_INPUT,
));
}
let res = buf[0];
*buf = &buf[1..];
Ok(res)
}
So, it works the following way:
Deserializer tries to pull one byte from input; if there's none - this is an error.
This byte is interpreted as an index of enum variant; if it doesn't match to one of variants - this is an error.
If the variant contains any data, deserializer tries to pull this data from the input; if it fails (according to the inner type's implementation) - this is an error.
I'm creating some User Input components in Rust. I'm trying to modularize these components using trait CustomInput. This trait implements some basic for all UI Input components, like being able to set and get the values of the component. All my UI components will take a String input, and parse that string to some typed value, maybe an f64, i32, or Path. Because different UI components can return different types, I created enum CustomOutputType, so that get_value can have a single return type for all components.
As users can input any random string, each UI component needs to have a means of ensuring that the input string can be successfully converted to the right type for the respective input component, I.e MyFloatInput needs to return a f64 value, MyIntInput needs to return an i32, and so on. I see that std::str::FromStr is implemented for most basic types, bool, f64, usize, but I don't think this kind of parsing is robust enough for UI inputs.
I'd like some more advanced functionality than std::str::FromStr provides, I'd like to be able to:
Evaluate expression, i.e input of 500/25 returns 20
Floor/Ceiling for min/max values: i.e if max is 1000 and input is 2000, value is 1000
What would be an idiomatic way of implementing such parsing
functionality for my UI components?
Is using the CustomOutputType enum an efficient strategy for handling the
different types the UI Components could return?
Would a parsing crate like Nom be advisable in this situation? I think this might be
overkill for what I'm doing, but I'm not sure.
I've taken a stab at creating such a system:
use std::str::FromStr;
use snafu::{Snafu};
#[derive(Debug, Snafu)]
pub enum Error{
#[snafu(display("Incorrect Type: {:?}", msg))]
IncorrectInputType{msg: String},
}
type Result<T, E = Error> = std::result::Result<T, E>;
#[derive(Clone, Debug)]
pub enum CustomOutputType{
CFloat(f64),
CInt(i32),
}
pub trait CustomInput{
fn get_value(&self)->CustomOutputType;
fn set_value(&mut self, val: String)->Result<()>;
}
pub struct MyFloatInput{
val: f64,
}
pub struct MyIntInput{
val: i32
}
impl CustomInput for MyFloatInput{
fn get_value(&self)->CustomOutputType{
CustomOutputType::CFloat(self.val)
}
fn set_value(&mut self, val: String)->Result<()>{
match f64::from_str(&val) {
Ok(inp)=>Ok(self.val = inp),
// Ok(inp)=>{self.val = inp}
Err(e) => IncorrectInputType { msg: "setting float input failed".to_string() }.fail()?
}
}
}
impl CustomInput for MyIntInput{
fn get_value(&self)->CustomOutputType{
CustomOutputType::CInt(self.val)
}
fn set_value(&mut self, val: String)->Result<()>{
match i32::from_str(&val) {
Ok(inp)=>Ok(self.val = inp),
Err(e) => IncorrectInputType { msg: "setting int input failed".to_string() }.fail()?
}
}
}
fn main() {
let possible_inputs = vec!["100".to_string(), "100.0".to_string(), "100/2".to_string(), ".1".to_string(), "teststr".to_string()];
let mut int_input = MyIntInput{val: 0};
let mut float_input = MyFloatInput{val: 0.0};
for x in &possible_inputs{
int_input.set_value(x.to_string()).unwrap();
float_input.set_value(x.to_string()).unwrap();
}
}
I'm using [prost] to generate structs from protobuf. One of those structs is quite simple:
enum Direction {
up = 0;
down = 1;
sideways = 2;
}
This generates code which looks like:
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord, ::prost::Enumeration)]
#[repr(i32)]
#[derive(serde_derive::Deserialize)]
pub enum Direction {
Up = 0,
Down = 1,
Sideways = 2,
}
There's a significant number of JSON files I have to parse into these messages. Those are tens of thousands of lines long, but when this field appears it looks like:
{ "direction": "up" }
So, in short, its deserialized format is a string, serialized is i32.
If I just run that, and try to parse JSON, I get:
thread 'tests::parse_json' panicked at 'Failed to parse: "data/my_data.json": Error("invalid type: string \"up\", expected i32", line: 132, column: 23)
That, of course, makes sense - there's no reflection to guide deserialization from "up" to 0.
Question: How can I set up serde to parse those strings into their matching integer values? I've read the serde docs thoroughly, and it appears I might need to write a custom deserializer for this, although that seems like overkill.
I've tried a few different serde attributes, like:
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord, ::prost::Enumeration)]
#[repr(i32)]
#[derive(serde_derive::Deserialize)]
#[serde(from = "&str")] // This line
pub enum Direction {
Up = 0,
Down = 1,
Sideways = 2,
}
with this function:
impl From<&str> for Direction {
fn from(item: &str) -> Self {
match item {
"up" => Self::Up,
"down" => Self::Down,
"sideways" => Self::Sideways,
_ => panic!("Invalid value for Direction: {}", item),
}
}
}
but, despite what the serde docs tell me, that method doesn't even get called (but compilation succeeds).
I also tried a field attribute on the field which is a Direction:
#[serde(deserialize_with = \"super::super::common::Direction::from\")]
but that of course wants a different signature than just &str: the trait std::convert::From<__D> is not implemented for common::Direction
Do I just have to write a custom deserializer? Seems like a common enough use case that there would be a pattern to use.
Note: this is the opposite problem of that solved by serde_repr. I didn't see a way to put it to work here.
I implemented my own deserializer, thanks to the guide in this answer. There's likely a simpler or more idiomatic approach out there, so if you know one, please share!
Serde attribute, set on the field rather than the Enum:
config.field_attribute(
"direction",
"#[serde(deserialize_with = \"super::super::common::Direction::from_str\")]"
);
Deserializer:
impl Direction {
pub fn from_str<'de, D>(deserializer: D) -> Result<i32, D::Error>
where
D: Deserializer<'de>,
{
let s: &str = Deserialize::deserialize(deserializer)?;
match s.to_lowercase().as_str() {
"up" => Ok(Self::Tx as i32),
"down" => Ok(Self::Down as i32),
"sideways" => Ok(Self::Sideways as i32),
_ => Err(de::Error::unknown_variant(s, &["up", "down", "sideways"])),
}
}
}
I have around 10 structs with between 5-10 fields each and I want to be able to print them out using the same format.
Most of my structs look like this:
struct Example {
a: Option<String>,
b: Option<i64>,
c: Option<String>,
... etc
}
I would like to be able to define a impl for fmt::Display without having to enumerate the fields again so there is no chance for missing one if a new one is added.
For the struct:
let eg = Example{
a: Some("test".to_string),
b: Some(123),
c: None,
}
I would like the output format:
a: test
b: 123
c: -
I currently am using #[derive(Debug)] but I don't like that it prints out Some(X) and None and a few other things.
If I know that all the values inside my structs are Option<T: fmt::Display> can I generate myself a Display method without having to list the fields again?
This may not be the most minimal implementation, but you can derive serialisable and use the serde crate. Here's an example of a custom serialiser: https://serde.rs/impl-serializer.html
In your case it may be much simpler (you need only a handful of types and can panic/ignore on anything unexpected).
Another approach could be to write a macro and create your own lightweight serialisation solution.
I ended up solving this with a macro. While it is not ideal it does the job.
My macro currently looks like this:
macro_rules! MyDisplay {
($struct:ident {$( $field:ident:$type:ty ),*,}) => {
#[derive(Debug)]
pub struct $struct { pub $($field: $type),*}
impl fmt::Display for $struct {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
$(
write!(f, "{}: {}\n",
stringify!($field).to_string(),
match &self.$field {
None => "-".to_string(),
Some(x) => format!("{:#?}", x)
}
)?;
)*
Ok(())
}
}
};
}
Which can be used like this:
MyDisplay! {
Example {
a: Option<String>,
b: Option<i64>,
c: Option<String>,
}
}
Playground with an example:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=cc089f8aecaa04ce86f3f9e0307f8785
My macro is based on the one here https://stackoverflow.com/a/54177889/1355121 provided by Cerberus