What is the purpose of matrix in quote_each_token? - rust

I can't grasp the purpose of the matrix-like token layout used inside the expansion of quote!, the macro provided by the quote crate.
According to https://docs.rs/quote/latest/src/quote/lib.rs.html#473, quote! is defined as:
#[macro_export]
macro_rules! quote {
    () => {
        $crate::__private::TokenStream::new()
    };
    ($($tt:tt)*) => {{
        let mut _s = $crate::__private::TokenStream::new();
        $crate::quote_each_token!(_s $($tt)*);
        _s
    }};
}
As shown above, it calls another macro:
#[macro_export]
#[doc(hidden)]
macro_rules! quote_each_token {
    ($tokens:ident $($tts:tt)*) => {
        $crate::quote_tokens_with_context!($tokens
            (@ @ @ @ @ @ $($tts)*)
            (@ @ @ @ @ $($tts)* @)
            (@ @ @ @ $($tts)* @ @)
            (@ @ @ $(($tts))* @ @ @)
            (@ @ $($tts)* @ @ @ @)
            (@ $($tts)* @ @ @ @ @)
            ($($tts)* @ @ @ @ @ @)
        );
    };
}
As you can see, there is a chain of nested macros, but I'm lost right at the start, because I don't see the general idea behind this code or what it is useful for. To me, building what looks like a matrix of tokens seems pointless. I have read the comments that accompany the source code, but they are all Greek to me.
Could someone please spell it out?
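A minimal sketch of one way to read the padded rows, assuming (as the source comments seem to describe) that they implement a sliding window over the tokens. macro_rules! can only walk several repetitions in lockstep when they have the same length, so each row is a copy of the input shifted by a different amount of filler and padded to equal length. Position i of every row then holds a different neighbour of token i. The macro with_previous below is hypothetical and much smaller than the real quote_tokens_with_context: it uses two rows, takes @ as the filler token, and pairs each token with its predecessor.

```rust
// Hypothetical macro, not part of the quote crate: it pairs every token with
// the token before it by zipping two padded copies of the input.
macro_rules! with_previous {
    // Internal rule: both rows have the same length, so a single repetition
    // can walk them in lockstep.
    (@zip ($($prev:tt)*) ($($curr:tt)*)) => {
        vec![$((stringify!($prev), stringify!($curr))),*]
    };
    // Entry point: row 1 is shifted right by one filler `@`, row 2 is padded
    // on the right with one filler so that the lengths match.
    ($($tt:tt)*) => {
        with_previous!(@zip (@ $($tt)*) ($($tt)* @))
    };
}

fn main() {
    let pairs = with_previous!(a b c);
    assert_eq!(pairs, vec![("@", "a"), ("a", "b"), ("b", "c"), ("c", "@")]);
    for (prev, curr) in pairs {
        println!("{} -> {}", prev, curr);
    }
}
```

With seven rows instead of two, each position would expose a token together with three tokens on either side, which is presumably enough context for a macro to recognise multi-token forms without being able to look ahead in the usual sense.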

Related

Rust: What does the `@` (at sign) operator do?

I saw the following line in my code, and I am not sure what it does, as I haven't encountered the @ operator before.
if let e @ Err(_) = changed {
    ...
}
Can this line be written without the @ operator? What would that look like?
It's a way to bind the matched value of a pattern to a variable (using the syntax variable @ subpattern). For example,
let x = 2;
match x {
    e @ 1 ..= 5 => println!("got a range element {}", e),
    _ => println!("anything"),
}
According to https://doc.rust-lang.org/book/ch18-03-pattern-syntax.html#-bindings and https://doc.rust-lang.org/reference/patterns.html#identifier-patterns, it is used to match against the pattern to the right of the @ while simultaneously binding the matched value to the identifier to the left of the @.
To answer your second question: yes, it would look like this:
if let Err(_) = &changed {
    // continue to use `changed` like you would use `e`
}
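Note that matching on the reference &changed makes it explicit that changed is only borrowed, so it stays usable inside the body. Strictly speaking, Err(_) binds nothing and therefore would not move changed even without the &; a move only happens when a pattern binds a non-Copy value by value, as e @ Err(_) does.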
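As a rough side-by-side sketch of the two forms discussed above (the function names and the Result<u32, String> type are made up for illustration, since the question does not show the type of changed):

```rust
// Hypothetical helper: `e @ Err(_)` matches the Err variant and binds the
// whole Result to `e`, moving `changed` into it.
fn with_binding(changed: Result<u32, String>) -> String {
    if let e @ Err(_) = changed {
        format!("failed: {:?}", e)
    } else {
        String::from("ok")
    }
}

// Same behaviour without `@`: match on a reference and keep using `changed`.
fn without_binding(changed: Result<u32, String>) -> String {
    if let Err(_) = &changed {
        format!("failed: {:?}", changed)
    } else {
        String::from("ok")
    }
}

fn main() {
    let err: Result<u32, String> = Err(String::from("boom"));
    assert_eq!(with_binding(err.clone()), without_binding(err));
    assert_eq!(with_binding(Ok(1)), "ok");
    println!("both forms agree");
}
```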

What does the "@" sign mean in a Rust expression? [duplicate]

This question already has an answer here:
What does the '@' symbol do in Rust?
(1 answer)
Closed 2 years ago.
I'm new to Rust, and while learning the standard library I found the expression Some(x @ 6..=10) in a match example. What does x @ 6..=10 mean?
something @ pattern is a form of pattern matching. Normally a match arm creates variables for parts of the matched value, but something @ pattern also creates a variable something and moves or copies the whole matched value into it.
The syntax low ..= high allows you to match on all numbers within the range from low to high (inclusive). The @ in a pattern allows you to bind a variable to the matched value. Here's an example to illustrate their uses:
fn log_num(maybe_num: Option<i32>) {
    match maybe_num {
        None => println!("didn't get a number"),
        Some(x @ 0..=5) => println!("got number {} between 0-5", x),
        Some(y @ 6..=10) => println!("got number {} between 6-10", y),
        Some(z) => println!("got number {} larger than 10", z),
    }
}
fn main() {
    log_num(None);
    log_num(Some(3));
    log_num(Some(7));
    log_num(Some(15));
}
This is a form of pattern matching, specifically @ bindings. In the form...
let y = Some(3);
if let Some(x @ 6..=10) = y {
    // ...
}
... the variable y needs to be a Some(...) to match, and the inner value will be bound to x only if it is within the range 6 to 10 (inclusive). In the example above, the if-block will not be executed: although y destructures to a Some(...), the inner value 3 does not fit the pattern, and therefore x is never bound.
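The same behaviour can be checked with a small runnable sketch; in_range is a hypothetical helper introduced only for this illustration:

```rust
// Hypothetical helper: returns the inner value only when it lies in 6..=10.
fn in_range(y: Option<i32>) -> Option<i32> {
    if let Some(x @ 6..=10) = y {
        // `x` holds the matched inner value here.
        Some(x)
    } else {
        // Reached for None and for values outside the range.
        None
    }
}

fn main() {
    assert_eq!(in_range(Some(3)), None); // 3 is outside 6..=10
    assert_eq!(in_range(Some(7)), Some(7)); // 7 matches and is bound to x
    assert_eq!(in_range(None), None);
    println!("all checks passed");
}
```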

NLP: Stemming on opcodes data set

I have a dataset of 27 files, each containing opcodes. I want to use stemming to map all variants of an opcode to the same opcode (for example, push, pusha, pushb, etc. would all be mapped to push; addf and addi to add; multi and multf to mult; and so on).
How can I do so? I tried using PorterStemmer with NLTK extensions, but it does not work on my dataset. I think it works only on ordinary natural-language words (like played, playing --> play) and not on opcodes (like pusha, pushb --> push).
I don't think stemming is what you want here. Stemmers are language-specific and are based on the common inflectional morphological patterns of that language. For example, in English, the infinitive form of a verb (e.g., "to walk") is inflected for tense, aspect, and person/number: I walk vs. she walks (walk+s), I walk vs. I walked (walk+ed), also walk+ing, etc. Stemmers codify these patterns into "rules" that are then applied to a "word" to reduce it to its stem. In other words, an off-the-shelf stemmer does not exist for your opcodes.
You have two possible solutions: (1) create a dictionary or (2) write your own stemmer. If you don't have too many variants to map, it is probably quickest to just create a custom dictionary where you use all your word variants as keys and the lemma/stem/canonical-form is the value.
addi -> add
addf -> add
multi -> mult
multf -> mult
If your potential mappings are too numerous to do by hand, then you could write a custom regex stemmer to do the mapping and conversion. Here is how you might do it in R. The following function takes an input word and tries to match it to a pattern representing all the variants of a stem, for all the n stems in your collection. It returns a 1 x n data.frame with 1 indicating presence or 0 indicating absence of variant match.
#' Return word's stem data.frame with each column indicating presence (1) or
#' absence (0) of stem in that word.
map_to_stem_df <- function(word) {
  ## named list of patterns to match
  stem_regex <- c(add = "^add[if]$",
                  mult = "^mult[if]$")
  ## iterate across the stem names
  res <- lapply(names(stem_regex), function(stem) {
    pat <- stem_regex[stem]
    ## if pattern matches word, then 1 else 0
    if (grepl(pattern = pat, x = word)) {
      pat_match <- 1
    } else {
      pat_match <- 0
    }
    ## create 1x1 data.frame for stem
    df <- data.frame(pat_match)
    names(df) <- stem
    return(df)
  })
  ## bind all cols into single row data.frame 1 x length(stem_regex) & return
  data.frame(res)
}
map_to_stem_df("addi")
# add mult
# 1 0
map_to_stem_df("additional")
# add mult
# 0 0

Selecting output lines in chunk output by modifying the default output hook

The knitr book, p. 118, §12.3.5, has an example of how to suppress long output by modifying
the output chunk hook, but it isn't at all general because it does it globally for all chunks.
I've tried to generalize that, to allow a chunk option, output.lines, which, if NULL, has no
effect, but otherwise selects and prints only the first output.lines lines. However, this version
seems to have no effect when I try it, and I can't figure out how to tell why.
More generally, I think this is useful enough to be included in knitr, and would be better if one
could specify a range of lines, e.g., output.lines=3:15, as is possible with echo=.
# get the default output hook
hook_output <- knit_hooks$get("output")
knit_hooks$set(output = function(x, options) {
  lines <- options$output.lines
  if (is.null(lines)) {
    hook_output(x, options) # pass to default hook
  }
  else {
    x <- unlist(stringr::str_split(x, "\n"))
    if (length(x) > lines) {
      # truncate the output, but add ....
      x <- c(head(x, lines), "...\n")
    }
    # paste these lines together
    x <- paste(x, collapse = "\n")
    hook_output(x, options)
  }
})
Test case:
<<print-painters, output.lines=8>>=
library(MASS)
painters
@
Actually, this solution does work. My actual test example was flawed. Maybe others will find this helpful.

Case Insensitive Pattern Matching over String Lists

I'm trying to parse command-line arguments in an F# application. I'm using pattern matching over the parameter list to accomplish it. Something like:
let rec parseCmdLnArgs =
    function
    | [] -> { OutputFile = None; OtherParam = None }
    | "/out" :: fileName :: rest ->
        let parsedRest = parseCmdLnArgs rest
        { parsedRest with OutputFile = Some(fileName) }
The problem is I want to make "/out" match case insensitive while preserving the case of other stuff. That means I can't alter the input and match the lowercase version of the input against it (this will lose the fileName case information).
I have thought about several solutions:
Resort to when clauses which is less than ideal.
Match a tuple each time, the first would be the actual parameter (which I'll just save for further processing and will wildcard match it) and the second would be the lowercased version used in such matchings. This looks worse than the first.
Use active patterns but that looks too verbose. I'll have to repeat things like ToLower "/out" before every item.
Is there a better option or pattern for this kind of thing? I think this is a common problem and there should be a good way to handle it.
I quite like your idea of using F# active patterns to solve this. It is a bit more verbose than using pre-processing, but I think it's quite elegant. Also, according to some BCL guidelines, you shouldn't use ToLower when comparing strings case-insensitively. The right approach is to use the OrdinalIgnoreCase flag. You can still define a nice active pattern to do this for you:
open System

let (|InvariantEqual|_|) (str:string) arg =
    if String.Compare(str, arg, StringComparison.OrdinalIgnoreCase) = 0
    then Some() else None

match "HellO" with
| InvariantEqual "hello" -> printfn "yep!"
| _ -> printfn "Nop!"
You're right that it's more verbose, but it nicely hides the logic and it gives you enough power to use the recommended coding style (I'm not sure how this could be done using pre-processing).
I might do some pre-processing to allow for either "-" or "/" at the beginning of keywords, and to normalize the case:
let normalize (arg:string) =
    if arg.[0] = '/' || arg.[0] = '-' then
        ("-" + arg.[1..].ToLower())
    else arg
let normalized = args |> List.map normalize
It's perhaps not ideal, but it's not like any user is going to have enough patience to type so many command-line parameters that looping through them twice is noticeably slow.
You can use a guard (a when clause) to handle this:
let rec parseCmdLnArgs =
    function
    | [] -> { OutputFile = None; OtherParam = None }
    // the type annotation lets the compiler resolve root.ToUpper()
    | (root: string) :: fileName :: rest when root.ToUpper() = "/OUT" ->
        let parsedRest = parseCmdLnArgs rest
        { parsedRest with OutputFile = Some(fileName) }
Ran into this looking for a solution to a similar issue, and while Tomas' solution works for individual strings, it doesn't help with the original issue of pattern matching against lists of strings. A modified version of his active pattern allows matching lists:
let (|InvariantEqual|_|) : string list -> string list -> unit option =
    fun x y ->
        let f : unit option -> string * string -> unit option =
            fun state (x, y) ->
                match state with
                | None -> None
                | Some() ->
                    if x.Equals(y, System.StringComparison.OrdinalIgnoreCase)
                    then Some()
                    else None
        if x.Length <> y.Length then None
        else List.zip x y |> List.fold f (Some())

match ["HeLlO wOrLd"] with
| InvariantEqual ["hello World";"Part Two!"] -> printfn "Bad input"
| InvariantEqual ["hello WORLD"] -> printfn "World says hello"
| _ -> printfn "No match found"
I haven't been able to figure out how to make it match with placeholders properly to do | InvariantEqual "/out" :: fileName :: rest -> ... yet, but if you know the entire contents of the list, it's an improvement.
