Why does this Rust HashMap macro no longer work?

I previously used:
#[macro_export]
macro_rules! map(
{ T:ident, $($key:expr => $value:expr),+ } => {
{
let mut m = $T::new();
$(
m.insert($key, $value);
)+
m
}
};
)
To create objects, like this:
let mut hm = map! {HashMap, "a string" => 21, "another string" => 33};
However, this no longer seems to work. The compiler reports:
- Failed:
macros.rs:136:24: 136:31 error: no rules expected the token `HashMap`
macros.rs:136 let mut hm = map! {HashMap, "a string" => 21, "another string" => 33};
^~~~~~~
What's changed with macro definitions that makes this no longer work?
The basic example below works fine:
macro_rules! foo(
{$T:ident} => { $T; };
)
struct Blah;
#[test]
fn test_map_create() {
let mut bar = foo!{Blah};
}
So this seems to be some change to how the {T:ident, $(...), +} expansion is processed?
What's going on here?

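You're missing the $ symbol before T. Without it, T:ident in the matcher is just the literal tokens T, : and ident rather than a metavariable, so the rule never accepts HashMap; the pattern should start with { $T:ident, ... }.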
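For reference, a minimal sketch of the macro with the missing $ restored. The brace-delimited definition, the optional trailing comma and the build helper are additions for illustration and are not part of the original code.

```rust
use std::collections::HashMap;

macro_rules! map {
    // `$T:ident` binds the name of the map type; a bare `T:ident` would only
    // match the literal tokens `T`, `:` and `ident`.
    ($T:ident, $($key:expr => $value:expr),+ $(,)?) => {{
        let mut m = $T::new();
        $( m.insert($key, $value); )+
        m
    }};
}

// Hypothetical helper showing a call site; the key and value types are
// inferred from the return type.
fn build() -> HashMap<&'static str, i32> {
    let hm = map! {HashMap, "a string" => 21, "another string" => 33};
    hm
}
```

Because $T is an ident, this form only accepts a bare type name such as HashMap or BTreeMap, not a full type like HashMap<String, i32>.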

How about this (play.rust-lang.org)?
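// nightly Rust when written; std::any::type_name_of_val is stable since Rust 1.76, so the feature gate below is only needed on older toolchains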
#![feature(type_name_of_val)]
use std::collections::{BTreeMap, HashMap};
macro_rules! map {
    ($T:ty; $( $key:literal : $val:expr ),* ,) => {
        {
            // $T is the full type, e.g. HashMap<String, i32>; <$T>::new() calls its associated function
            let mut m: $T = <$T>::new();
            $(
                m.insert($key.into(), $val);
            )*
            m
        }
    };
    // handle no trailing comma
    ($T:ty; $( $key:literal : $val:expr ),*) => {
        map!{$T; $( $key : $val ,)*}
    }
}
fn main() {
let hm = map! {HashMap<String, i32>;
"a": 1,
"b": 2,
"c": 3,
};
let bm = map! {BTreeMap<String, i32>;
"1": 1,
"2": 2,
"3": 3
};
println!("typeof hm = {}", std::any::type_name_of_val(&hm));
println!("typeof bm = {}", std::any::type_name_of_val(&bm));
dbg!(hm, bm);
}
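Additionally, delimit the definition with {}: macro_rules! map { ... } is an item, whereas the parenthesised form macro_rules! map( ... ) must be followed by a semicolon.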

Related

How to disambiguate against types with Vec<T> in macro_rules in Rust?

Let's say I have
macro_rules! tipey {
(Vec<$pt: ty>) => {2};
($pt: ty) => {1};
}
macro_rules! structy {
(struct $i: ident { $($p: ident : $(Vec<)? $pt: ty $(>)?,)+ }) => {
const v: &[usize] = &[ $(tipey!($pt)),+ ];
};
}
structy!(
struct ContentDetails {
pattern: String,
fields: Vec<String>,
}
);
I want to somehow be able to disambiguate the type and know whether it is a Vec<> or a simple type. I'm only dealing with Vecs, so there is no need to handle other generic types if that is not possible.
The issue I have is that if I match Vec<bool> against just $t: ty, I cannot split it up later to see whether $t was a Vec<> or not, but if I try to collect multiple tts or something else, parsing the list of properties breaks. I really want to avoid having to use proc macros.
This is going to be very unreliable for general types, and generic Rust syntax. But if you have a very narrow use-case then you can fix up your code something like this:
macro_rules! tipey {
    (Vec<$pt: tt>) => { 2 };
    ($pt: tt) => { 1 };
}
macro_rules! structy {
    (struct $i: ident {
        $($p: ident: $pt: tt $(<$gt: tt>)?),+
        $(,)?
    }) => {
        const v: &[usize] = &[ $(tipey!( $pt $(<$gt>)?)),+ ];
    };
}
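To check what the tt-based version produces, here is a sketch that applies it to the struct from the question. The constant is renamed to KINDS and the kinds function is a hypothetical accessor; both are assumptions added for illustration.

```rust
macro_rules! tipey {
    // `Vec<...>` with a single token tree as its parameter.
    (Vec<$pt:tt>) => { 2 };
    // Any other single token tree, e.g. `String` or `bool`.
    ($pt:tt) => { 1 };
}

macro_rules! structy {
    (struct $i:ident {
        $($p:ident : $pt:tt $(<$gt:tt>)?),+
        $(,)?
    }) => {
        // Renamed from `v` only to follow the naming convention for constants.
        const KINDS: &[usize] = &[ $(tipey!( $pt $(<$gt>)? )),+ ];
    };
}

structy!(
    struct ContentDetails {
        pattern: String,
        fields: Vec<String>,
    }
);

// Hypothetical accessor for the generated constant.
fn kinds() -> &'static [usize] {
    KINDS
}
```

Since $pt: tt captures exactly one token tree, a field written with a path such as std::vec::Vec<String> would not match this pattern, which is the kind of unreliability mentioned above.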

Keep the LocatedSpan of an "outer" parser with nom in Rust

I am trying to write a parser using the nom crate (and the nom_locate) that can parse strings such as u{12a}, i.e.:
u\{([0-9a-fA-F]{1,6})\}
I wrote the following parser combinator:
use nom::bytes::complete::{take_while_m_n};
use nom::character::complete::{char};
use nom::combinator::{map_opt, map_res};
use nom::sequence::{delimited, preceded};
pub type LocatedSpan<'a> = nom_locate::LocatedSpan<&'a str>;
pub type IResult<'a, T> = nom::IResult<LocatedSpan<'a>, T>;
#[derive(Clone, Debug)]
pub struct LexerError<'a>(LocatedSpan<'a>, String);
fn expect<'a, F, E, T>(
mut parser: F,
err_msg: E,
) -> impl FnMut(LocatedSpan<'a>) -> IResult<Option<T>>
where
F: FnMut(LocatedSpan<'a>) -> IResult<T>,
E: ToString,
{
use nom::error::Error as NomError;
move |input| match parser(input) {
Ok((remaining, output)) => Ok((remaining, Some(output))),
Err(nom::Err::Error(NomError { input, code: _ }))
| Err(nom::Err::Failure(NomError { input, code: _ })) => {
let err = LexerError(input, err_msg.to_string());
// TODO Report error.
println!("error: {:?}", err);
Ok((input, None))
}
Err(err) => Err(err),
}
}
fn lit_str_unicode_char(input: LocatedSpan) -> IResult<char> {
let parse_hex = take_while_m_n(1, 6, |c: char| c.is_ascii_hexdigit());
// FIXME Figure out a way to keep correct span here.
let parse_delim_hex = preceded(
char('u'),
delimited(
char('{'),
expect(parse_hex, "expected 1-6 hex digits"),
expect(char('}'), "expected closing brace"),
),
);
let parse_u32 = map_res(parse_delim_hex, move |hex| match hex {
None => Err("cannot parse number"),
Some(hex) => match u32::from_str_radix(hex.fragment(), 16) {
Ok(val) => Ok(val),
Err(_) => Err("invalid number"),
},
});
map_opt(parse_u32, std::char::from_u32)(input)
}
fn main() {
let raw = "u{61}";
let span = LocatedSpan::new(raw);
let result = lit_str_unicode_char(span);
println!("{:#?}", result);
}
This works correctly, I am able to get the Unicode character out of the string. However, this approach does not keep the proper spans, i.e.:
u{123}
\..../ <--- the span I want
\/ <--- the span I get
I figured I could wrap the parse_delim_hex in a recognize, which would keep the span correctly, but then I couldn't use the following parsers to "understand" the digits.
How should I get around this issue?
I think you misunderstand the purpose of the first parameter of IResult.
Quote from the documentation:
The Ok side is a pair containing the remainder of the input (the part of the data that was not parsed) and the produced value.
The span you are looking at is not the data that was found, but instead the data that was left over afterwards.
I think what you were trying to achieve is something along these lines:
use nom::bytes::complete::take_while_m_n;
use nom::character::complete::char;
use nom::combinator::{map_opt, map_res};
use nom::{InputTake, Offset};
use nom::sequence::{delimited, preceded};
pub type LocatedSpan<'a> = nom_locate::LocatedSpan<&'a str>;
pub type IResult<'a, T> = nom::IResult<LocatedSpan<'a>, T>;
#[derive(Clone, Debug)]
pub struct LexerError<'a>(LocatedSpan<'a>, String);
fn expect<'a, F, E, T>(
mut parser: F,
err_msg: E,
) -> impl FnMut(LocatedSpan<'a>) -> IResult<Option<T>>
where
F: FnMut(LocatedSpan<'a>) -> IResult<T>,
E: ToString,
{
use nom::error::Error as NomError;
move |input| match parser(input) {
Ok((remaining, output)) => Ok((remaining, Some(output))),
Err(nom::Err::Error(NomError { input, code: _ }))
| Err(nom::Err::Failure(NomError { input, code: _ })) => {
let err = LexerError(input, err_msg.to_string());
// TODO Report error.
println!("error: {:?}", err);
Ok((input, None))
}
Err(err) => Err(err),
}
}
fn lit_str_unicode_char(input: LocatedSpan) -> IResult<(char, LocatedSpan)> {
let parse_hex = take_while_m_n(1, 6, |c: char| c.is_ascii_hexdigit());
// The span of the whole match is recovered after parsing, below.
let parse_delim_hex = preceded(
char('u'),
delimited(
char('{'),
expect(parse_hex, "expected 1-6 hex digits"),
expect(char('}'), "expected closing brace"),
),
);
let parse_u32 = map_res(parse_delim_hex, |hex| match hex {
None => Err("cannot parse number"),
Some(hex) => match u32::from_str_radix(hex.fragment(), 16) {
Ok(val) => Ok(val),
Err(_) => Err("invalid number"),
},
});
// Do the actual parsing
let (s, ch) = map_opt(parse_u32, std::char::from_u32)(input)?;
let span_offset = input.offset(&s);
let span = input.take(span_offset);
Ok((s, (ch, span)))
}
fn main() {
let span = LocatedSpan::new("u{62} bbbb");
let (rest, (ch, span)) = lit_str_unicode_char(span).unwrap();
println!("Leftover: {:?}", rest);
println!("Character: {:?}", ch);
println!("Parsed Span: {:?}", span);
}
Leftover: LocatedSpan { offset: 5, line: 1, fragment: " bbbb", extra: () }
Character: 'b'
Parsed Span: LocatedSpan { offset: 0, line: 1, fragment: "u{62}", extra: () }
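The difference between the remainder and the consumed span can also be seen without nom. The sketch below uses a hypothetical hand-rolled parser (take_hex and consumed are illustrative names, not nom functions) that returns its result in the same (remaining, value) order, and then recovers the consumed prefix from the two inputs, which is the idea behind the input.offset(&s) and input.take(...) calls above.

```rust
/// Hypothetical hand-rolled parser (not nom): consumes leading hex digits and
/// returns `(remaining input, matched digits)`, the same order nom uses.
fn take_hex(input: &str) -> Result<(&str, &str), String> {
    let end = input
        .find(|c: char| !c.is_ascii_hexdigit())
        .unwrap_or(input.len());
    if end == 0 {
        return Err(format!("expected hex digits at {:?}", input));
    }
    Ok((&input[end..], &input[..end]))
}

/// Recovers the consumed prefix from the input before and after parsing,
/// assuming `after` is a suffix of `before`.
fn consumed<'a>(before: &'a str, after: &str) -> &'a str {
    &before[..before.len() - after.len()]
}
```

For the input "61} tail", take_hex leaves "} tail" as the remainder, and consumed gives back "61" as the part that was actually parsed.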

How to parse TOML in Rust with unknown structure?

My configuration file has a large number of arbitrary key-value pairs in it, which I want to parse using the toml crate. However it seems as if the standard way is to use a given struct that fits the configuration file. How can I load the key-value pairs into a data structure like a map or an iterator of pairs, instead of having to specifiy the structure beforehand with a struct?
toml has a Value type (an enum) that can hold any TOML value and that you can inspect dynamically to discover the content, without committing to a specific structure.
use toml::Value;
fn show_value(
value: &Value,
indent: usize,
) {
let pfx = " ".repeat(indent);
print!("{}", pfx);
match value {
Value::String(string) => {
println!("a string --> {}", string);
}
Value::Integer(integer) => {
println!("an integer --> {}", integer);
}
Value::Float(float) => {
println!("a float --> {}", float);
}
Value::Boolean(boolean) => {
println!("a boolean --> {}", boolean);
}
Value::Datetime(datetime) => {
println!("a datetime --> {}", datetime);
}
Value::Array(array) => {
println!("an array");
for v in array.iter() {
show_value(v, indent + 1);
}
}
Value::Table(table) => {
println!("a table");
for (k, v) in table.iter() {
println!("{}key {}", pfx, k);
show_value(v, indent + 1);
}
}
}
}
fn main() {
let input_text = r#"
abc = 123
[def]
ghi = "hello"
jkl = [ 12.34, 56.78 ]
"#;
let value = input_text.parse::<Value>().unwrap();
show_value(&value, 0);
}
/*
a table
key abc
an integer --> 123
key def
a table
key ghi
a string --> hello
key jkl
an array
a float --> 12.34
a float --> 56.78
*/
You actually don't need to do anything special other than tell it to deserialize into a HashMap:
use std::collections::HashMap;
use toml;
fn main() {
let toml_data = r#"
foo = "123"
bar = "456"
"#;
let config: HashMap<String, String> = toml::from_str(toml_data).unwrap();
println!("{:?}", config);
}
Of course, since TOML and Rust are both typed, your values all need to be the same type (String in this example), and it cannot handle tables, since it wouldn't know where in the map a table should go.
If you do have a couple of tables, just add your maps as fields to a struct, which works just as simply:
use serde::{Deserialize, Serialize};
use std::collections::HashMap;
use toml;
#[derive(Debug, Serialize, Deserialize)]
struct Config {
data_a: HashMap<String, String>,
data_b: HashMap<String, String>,
}
fn main() {
let toml_data = r#"
[data_a]
foo = "123"
bar = "456"
[data_b]
bat = "123"
baz = "456"
"#;
let config: Config = toml::from_str(toml_data).unwrap();
println!("{:?}", config);
}
For what it's worth:
I came here looking for a way to handle a TOML config file for my project.
This is what I found:
You can parse an arbitrary TOML file by using the Table type.
See the documentation.
All types can be parsed automatically, but you cannot escape the fact that Rust is typed, so you have to convert each value into the type you expect.
See my example:
use toml::Table;
fn main() {
//Load toml file
let path = std::path::Path::new("../Cargo.toml");
let file = match std::fs::read_to_string(path) {
Ok(f) => f,
Err(e) => panic!("{}", e),
};
let cfg: Table = file.parse().unwrap();
println!("Config in table format\n");
dbg!(&cfg);
println!("Index into config");
let cfg_string: &str = cfg["package"]["version"].as_str().unwrap();
println!("Version: {:?}", cfg_string);
let cfg_bool: bool = cfg["package"]["nest"]["nested_bool"].as_bool().unwrap();
println!("Nested bool: {:?}", cfg_bool);
// Falls back to the default: as_float() returns None here because nested_int is an integer
let cfg_float: f64 = cfg["package"]["nest"]["nested_int"]
.as_float()
.unwrap_or(5.0);
println!("Default float to value: {:?}", cfg_float);
}
The TOML file:
[package]
name = "rust_test"
version = "0.1.0"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
toml = "0.6.0"
[package.nest]
nested_float = 1.0
nested_int = 1
nested_bool = false
Output:
Config in table format
[src/main.rs:13] &cfg = {
"dependencies": Table(
{
"toml": String(
"0.6.0",
),
},
),
"package": Table(
{
"edition": String(
"2021",
),
"name": String(
"rust_test",
),
"nest": Table(
{
"nested_bool": Boolean(
false,
),
"nested_float": Float(
1.0,
),
"nested_int": Integer(
1,
),
},
),
"version": String(
"0.1.0",
),
},
),
}
Index into config
Version: "0.1.0"
Nested bool: false
Default float to value: 5.0

Rust macro: capture exactly matching tokens

My goal is to write a macro expand! such that:
struct A;
struct B;
struct Mut<T>;
expand!() => ()
expand!(A) => (A,)
expand!(mut A) => (Mut<A>,)
expand!(A, mut B) => (A, Mut<B>,)
// etc
[Edit] added trailing comma for consistent tuple syntax.
I wrote this macro so far:
macro_rules! to_type {
( $ty:ty ) => { $ty };
( mut $ty:ty ) => { Mut<$ty> };
}
macro_rules! expand {
( $( $(mut)? $ty:ty ),* ) => {
(
$( to_type!($ty) ),*
,)
};
}
What I'm struggling with is capturing the mut token. How can I bind it to a variable and reuse it in the macro body? Is it possible to work on more than one token at a time?
Something like this?
macro_rules! expand {
(#phase2($($ty_final:ty),*),) => {
($($ty_final,)*)
};
(#phase2($($ty_final:ty),*), mut $ty:ty, $($rest:tt)*) => {
expand!(#phase2($($ty_final,)* Mut::<$ty>), $($rest)*)
};
(#phase2($($ty_final:ty),*), $ty:ty, $($rest:tt)*) => {
expand!(#phase2($($ty_final,)* $ty), $($rest)*)
};
($($t:tt)*) => {
expand!(#phase2(), $($t)*)
};
}
struct A;
struct B;
struct Mut<T>(std::marker::PhantomData<T>);
fn main() {
#[allow(unused_parens)]
let _: expand!() = ();
#[allow(unused_parens)]
let _: expand!(A,) = (A,);
#[allow(unused_parens)]
let _: expand!(mut B,) = (Mut::<B>(Default::default()),);
#[allow(unused_parens)]
let _: expand!(A, mut B,) = (A, Mut::<B>(Default::default()));
}

Why does matching on the result of Regex::find complain about expecting a struct regex::Match but found tuple?

I copied this code from Code Review into IntelliJ IDEA to try and play around with it. I have a homework assignment that is similar to this one (I need to write a version of Linux's bc in Rust), so I am using this code only for reference purposes.
use std::io;
extern crate regex;
#[macro_use]
extern crate lazy_static;
use regex::Regex;
fn main() {
let tokenizer = Tokenizer::new();
loop {
println!("Enter input:");
let mut input = String::new();
io::stdin()
.read_line(&mut input)
.expect("Failed to read line");
let tokens = tokenizer.tokenize(&input);
let stack = shunt(tokens);
let res = calculate(stack);
println!("{}", res);
}
}
#[derive(Debug, PartialEq)]
enum Token {
Number(i64),
Plus,
Sub,
Mul,
Div,
LeftParen,
RightParen,
}
impl Token {
/// Returns the precedence of op
fn precedence(&self) -> usize {
match *self {
Token::Plus | Token::Sub => 1,
Token::Mul | Token::Div => 2,
_ => 0,
}
}
}
struct Tokenizer {
number: Regex,
}
impl Tokenizer {
fn new() -> Tokenizer {
Tokenizer {
number: Regex::new(r"^[0-9]+").expect("Unable to create the regex"),
}
}
/// Tokenizes the input string into a Vec of Tokens.
fn tokenize(&self, mut input: &str) -> Vec<Token> {
let mut res = vec![];
loop {
input = input.trim_left();
if input.is_empty() { break }
let (token, rest) = match self.number.find(input) {
Some((_, end)) => {
let (num, rest) = input.split_at(end);
(Token::Number(num.parse().unwrap()), rest)
},
_ => {
match input.chars().next() {
Some(chr) => {
(match chr {
'+' => Token::Plus,
'-' => Token::Sub,
'*' => Token::Mul,
'/' => Token::Div,
'(' => Token::LeftParen,
')' => Token::RightParen,
_ => panic!("Unknown character!"),
}, &input[chr.len_utf8()..])
}
None => panic!("Ran out of input"),
}
}
};
res.push(token);
input = rest;
}
res
}
}
/// Transforms the tokens created by `tokenize` into RPN using the
/// [Shunting-yard algorithm](https://en.wikipedia.org/wiki/Shunting-yard_algorithm)
fn shunt(tokens: Vec<Token>) -> Vec<Token> {
let mut queue = vec![];
let mut stack: Vec<Token> = vec![];
for token in tokens {
match token {
Token::Number(_) => queue.push(token),
Token::Plus | Token::Sub | Token::Mul | Token::Div => {
while let Some(o) = stack.pop() {
if token.precedence() <= o.precedence() {
queue.push(o);
} else {
stack.push(o);
break;
}
}
stack.push(token)
},
Token::LeftParen => stack.push(token),
Token::RightParen => {
let mut found_paren = false;
while let Some(op) = stack.pop() {
match op {
Token::LeftParen => {
found_paren = true;
break;
},
_ => queue.push(op),
}
}
assert!(found_paren)
},
}
}
while let Some(op) = stack.pop() {
queue.push(op);
}
queue
}
/// Takes a Vec of Tokens converted to RPN by `shunt` and calculates the result
fn calculate(tokens: Vec<Token>) -> i64 {
let mut stack = vec![];
for token in tokens {
match token {
Token::Number(n) => stack.push(n),
Token::Plus => {
let (b, a) = (stack.pop().unwrap(), stack.pop().unwrap());
stack.push(a + b);
},
Token::Sub => {
let (b, a) = (stack.pop().unwrap(), stack.pop().unwrap());
stack.push(a - b);
},
Token::Mul => {
let (b, a) = (stack.pop().unwrap(), stack.pop().unwrap());
stack.push(a * b);
},
Token::Div => {
let (b, a) = (stack.pop().unwrap(), stack.pop().unwrap());
stack.push(a / b);
},
_ => {
// By the time the token stream gets here, all the LeftParen
// and RightParen tokens will have been removed by shunt()
unreachable!();
},
}
}
stack[0]
}
When I run it, however, it gives me this error:
error[E0308]: mismatched types
--> src\main.rs:66:22
|
66 | Some((_, end)) => {
| ^^^^^^^^ expected struct `regex::Match`, found tuple
|
= note: expected type `regex::Match<'_>`
found type `(_, _)`
It's complaining that I am using a tuple for the Some() method when I am supposed to use a token. I am not sure what to pass for the token, because it appears that the tuple is traversing through the Token options. How do I re-write this to make the Some() method recognize the tuple as a Token? I have been working on this for a day but I have not found any really good solutions.
The code you are referencing is over two years old. Notably, that predates regex 1.0. Version 0.1.80 defines Regex::find as:
fn find(&self, text: &str) -> Option<(usize, usize)>
while version 1.0.6 defines it as:
pub fn find<'t>(&self, text: &'t str) -> Option<Match<'t>>
However, Match provides methods for the start and end indices that the older code expected as a tuple. Since you only care about the end index here, you can call Match::end:
let (token, rest) = match self.number.find(input).map(|x| x.end()) {
Some(end) => {
// ...
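The shape of the fixed branch can be tried out without the regex crate. In the sketch below, leading_number_end is a hypothetical standard-library stand-in for self.number.find(input).map(|x| x.end()) with the pattern ^[0-9]+, and split_number is an illustrative helper; neither name comes from the original code.

```rust
/// Hypothetical stand-in for `find(input).map(|m| m.end())` with `^[0-9]+`:
/// returns the byte index just past the leading run of ASCII digits, if any.
fn leading_number_end(input: &str) -> Option<usize> {
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    if end == 0 { None } else { Some(end) }
}

/// Splits a leading number off the input, mirroring the `Some(end)` arm.
fn split_number(input: &str) -> Option<(i64, &str)> {
    match leading_number_end(input) {
        // A plain `usize` is matched here, not a `(start, end)` tuple.
        Some(end) => {
            let (num, rest) = input.split_at(end);
            num.parse().ok().map(|n| (n, rest))
        }
        None => None,
    }
}
```

With the real crate, only the stand-in function changes; the Some(end) arm and the split_at call stay as in the original tokenizer.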
