Velocity: Is it possible to nest macros that use #@ and $bodyContent? - scope

I have a macro that looks essentially like this:
#macro( surround $x )
surround:$x
$bodyContent
/surround:$x
#end
Invocation #@surround("A")bunch o' stuff#end produces "surround:A bunch o' stuff /surround:A" as
expected. Invocation #@surround("A")#@surround("B")more stuff#end#end produces
surround:A surround:B more stuff /surround:B /surround:A which is exactly what I want.
But now I want to build upwards with another macro
#macro( annotated-surround $x $y )
#@surround( $x )
annotate:$y
$bodyContent
#end
#end
The intended expansion of #@annotated-surround( "C" "note" ) stuff #end is
surround:C annotate:note stuff /surround:C
...but this doesn't work; I get the dreaded semi-infinite expansion of the annotated-surround body
content.
I have read the answer at Closure in Velocity template macros and still don't quite know whether what I want to do is possible.
I'm willing to do arbitrarily tricky things within the definitions of #surround and
#annotated-surround, but I don't want the users of those macros to see any complexity. The
whole idea is to simplify their lives.
As long as I have your ear: Setting macro.provide.scope.control=true is supposed to "provide a local namespace in macros". What does this mean? Is the provided namespace independent of the default context, but with a single such space shared among all invocations of all macros? Or is a separate context provided for each macro invocation, even recursively? It has to be the latter because of $macro.parent, right?
And yet another question. Consider the following macro:
#macro( recursive $x )
#if($x == 0)
zero
#else
$x before . . .
#set($xMinusOne = $x - 1)
#recursive($xMinusOne)
. . . $x after
#end
#end
#recursive( 4 ) yields:
4 before . . .
3 before . . .
2 before . . .
1 before . . .
zero
. . . 0 after
. . . 0 after
. . . 0 after
. . . 4 after
Now I understand all those occurrences of "0": there's only one global $x, so assigning to it on
the recursive calls smashes it and it doesn't get restored. But where on earth does that final "4"
come from? For that matter, how is it that my first "surround" macro works to arbitrary depth;
how come its final $x doesn't get smashed in inner calls?
Sorry to be so prolix, but I have been unable to find clear documentation in this matter.

The problem is the combination of global variables, a name collision, and lazy rendering.
Let's walk through the rendering process for #@annotated-surround( "x" "y" )content#end:
Rendering enters the annotated-surround macro. The context map contains:
$x = String x
$y = String y
$bodyContent = Renderable content - note that the String output of this has not yet been evaluated.
Rendering of the first line enters the surround macro. This updates the context map to:
new $x = old $x = String x
$y = String y
$bodyContent = Renderable annotate:$y\n$bodyContent - note that the String output of this still has not yet been evaluated, it's still template code.
Rendering outputs the first line of surround, producing the String surround:x.
Rendering begins evaluating the second line of surround, which references $bodyContent.
Rendering the first line of $bodyContent produces the String annotate:y.
Rendering begins evaluating the second line of $bodyContent, which references $bodyContent.
Rendering the first line of $bodyContent produces the String annotate:y.
Rendering begins evaluating the second line of $bodyContent, which references $bodyContent.
etc.
The solution is to remove part of the problem's combination. Global variables and lazy rendering are fundamental parts of how Velocity works, so you can't touch those. That leaves the name collision. What you need is for each macro's $bodyContent to be referred to with a different name. This is easily achieved by assigning it to new variables with unique names in each macro before invoking any other macros, and using the new variable in any invoked macro's body, like this:
#macro( surround $x )
surround:$x
$bodyContent
/surround:$x
#end
#macro( annotated-surround $x $y )
#set( $annotated-surround-content = $bodyContent )
#@surround( $x )
annotate:$y
$annotated-surround-content
#end
#end
Rendering of this version goes like this:
Rendering enters the annotated-surround macro. The context map contains:
$x = String x
$y = String y
$bodyContent = Renderable content - note that the String output of this has not yet been evaluated.
Rendering of the first line executes the #set directive, adding a variable to the context map: $annotated-surround-content = current $bodyContent = Renderable content.
Rendering of the second line enters the surround macro. This updates the context map to:
new $x = old $x = String x
$y = String y
$annotated-surround-content = old $bodyContent = Renderable content
$bodyContent = Renderable annotate:$y\n$annotated-surround-content
Rendering outputs the first line of surround, producing the String surround:x.
Rendering begins evaluating the second line of surround, which references $bodyContent.
Rendering the first line of $bodyContent produces the String annotate:y.
Rendering begins evaluating the second line of $bodyContent, which references $annotated-surround-content.
Rendering $annotated-surround-content produces the String content.
Rendering outputs the third line of surround, producing the String /surround:x.
The final rendered output is surround:x annotate:y content /surround:x. This approach can be generalized by applying such substitutions to all occurrences of $bodyContent that are inside the content of another macro call, each time using a variable name derived from the macro's name to ensure uniqueness. It won't work for recursive macros without something extra to distinguish each nested invocation, however.
Regarding the scope setting, all that does is add a $macro object to the context, which is unique to each macro invocation and can be used as a map. If you set $macro.myVar to something different in each of two nested macro calls, the outer macro's value for it will be unchanged when the inner one finishes. This does not help with the $bodyContent issue, however, because any reference to $macro inside a macro's $bodyContent will be resolved to the innermost macro when it's rendered.
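For illustration, here is a minimal sketch of that per-invocation behaviour (it assumes macro.provide.scope.control=true is set; the macro names are purely illustrative):
#macro( inner )
#set( $macro.tag = "inner" )
inner sees: $macro.tag
#end
#macro( outer )
#set( $macro.tag = "outer" )
#inner()
outer still sees: $macro.tag
#end
#outer()
With scope control on, the #set inside #inner should not disturb the outer invocation's $macro.tag, so this should render "inner sees: inner" followed by "outer still sees: outer".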
Regarding the final 4 from #recursive( 4 ), that comes from a combination of macro arguments having local scope and being passed by name. For all but the outermost invocation of #recursive, the argument $x is a reference to the global context variable $xMinusOne - when they render the after line, the use of $x is actually resolved to looking up the current value of $xMinusOne in the global context. For the outermost invocation it is instead the constant value 4, and the arguments of the inner invocations go out of scope when they finish, so when the outermost one gets to the final line it's back to being 4.
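And here is a minimal sketch of the by-name argument behaviour that produces effects like that final 4 (again, the names are purely illustrative):
#set( $n = 1 )
#macro( shadow $arg )
#set( $n = 99 )
$arg
#end
#shadow( $n )
Because $arg is bound to the name $n rather than to its value, the #set inside the macro changes what $arg renders as, so the call should output 99 rather than 1.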

Starting with the easiest, macro.provide.scope.control=true will definitely create a separate $macro scope object for every macro invocation. Otherwise, as you note, $macro.parent would be nonsense. The whole point of the "scope controls" is to provide an explicit namespace for the type of VTL block in question. You can even do surround.provide.scope.control=true to automatically create a $surround scope inside of #@surround body content.
On your first question, I'm a little confused as to what's happening. Both the call to #@annotated-surround and the nested call to #@surround will make $bodyContent references available. Am I right that what's happening is that the "wrong" $bodyContent is being used? The $bodyContent reference should belong to the nearest block macro call. To reference the outer macro's $bodyContent within the inner macro, you'll probably need to #set( $macro.bodyContent = $bodyContent ) and then, within the inner, use it via $macro.parent.bodyContent.
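As an untested sketch, that suggestion would look something like this (it assumes macro.provide.scope.control=true and uses the block-macro syntax from the question):
#macro( surround $x )
surround:$x
$bodyContent
/surround:$x
#end
#macro( annotated-surround $x $y )
#set( $macro.bodyContent = $bodyContent )
#@surround( $x )
annotate:$y
$macro.parent.bodyContent
#end
#end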
As for the #recursive weirdness, I don't know offhand and have to get on to other work now. It also doesn't help that I don't have Velocity checked out on my present machine, so I can't quickly try things out.

Related

Understanding Raku's `&?BLOCK` compile-time variable

I really appreciate Raku's &?BLOCK variable – it lets you recurse within an unnamed block, which can be extremely powerful. For example, here's a simple, inline, and anonymous factorial function:
{ when $_ ≤ 1 { 1 };
$_ × &?BLOCK($_ - 1) }(5) # OUTPUT: «120»
However, I have some questions about it when used in more complex situations. Consider this code:
{ say "Part 1:";
my $a = 1;
print ' var one: '; dd $a;
print ' block one: '; dd &?BLOCK ;
{
my $a = 2;
print ' var two: '; dd $a;
print ' outer var: '; dd $OUTER::a;
print ' block two: '; dd &?BLOCK;
print "outer block: "; dd &?OUTER::BLOCK
}
say "\nPart 2:";
print ' block one: '; dd &?BLOCK;
print 'postfix for: '; dd &?BLOCK for (1);
print ' prefix for: '; for (1) { dd &?BLOCK }
};
which yields this output (I've shortened the block IDs):
Part 1:
var one: Int $a = 1
block one: -> ;; $_? is raw = OUTER::<$_> { #`(Block|…6696) ... }
var two: Int $a = 2
outer var: Int $a = 1
block two: -> ;; $_? is raw = OUTER::<$_> { #`(Block|…8496) ... }
outer block: -> ;; $_? is raw = OUTER::<$_> { #`(Block|…8496) ... }
Part 2:
block one: -> ;; $_? is raw = OUTER::<$_> { #`(Block|…6696) ... }
postfix for: -> ;; $_ is raw { #`(Block|…9000) ... }
prefix for: -> ;; $_ is raw { #`(Block|…9360) ... }
Here's what I don't understand about that: why does the &?OUTER::BLOCK refer (based on its ID) to block two rather than block one? Using OUTER with $a correctly causes it to refer to the outer scope, but the same thing doesn't work with &?BLOCK. Is it just not possible to use OUTER with &?BLOCK? If not, is there a way to access the outer block from the inner block? (I know that I can assign &?BLOCK to a named variable in the outer block and then access that variable in the inner block. I view that as a workaround but not a full solution because it sacrifices the ability to refer to unnamed blocks, which is where much of &?BLOCK's power comes from.)
Second, I am very confused by Part 2. I understand why the &?BLOCK that follows the prefix for refers to an inner block. But why does the &?BLOCK that precedes the postfix for also refer to its own block? Is a block implicitly created around the body of the for statement? My understanding is that the postfix forms were useful in large part because they do not require blocks. Is that incorrect?
Finally, why do some of the blocks have OUTER::<$_> in them but others do not? I'm especially confused by Block 2, which is not the outermost block.
Thanks in advance for any help you can offer! (And if any of the code behavior shown above indicates a Rakudo bug, I am happy to write it up as an issue.)
That's some pretty confusing stuff you've encountered. That said, it does all make some kind of sense...
Why does the &?OUTER::BLOCK refer (based on its ID) to block two rather than block one?
Per the doc, &?BLOCK is a "special compile variable", as is the case for all variables that have a ? as their twigil.
As such it's not a symbol that can be looked up at run-time, which is what syntax like $FOO::bar is supposed to be about afaik.
So I think the compiler ought by rights reject use of a "compile variable" with the package lookup syntax. (Though I'm not sure. Does it make sense to do "run-time" lookups in the COMPILING package?)
There may already be a bug filed (in either of the GH repos rakudo/rakudo/issues or raku/old-issues-tracker/issues) about it being erroneous to try to do a run-time lookup of a special compile variable (the ones with a ? twigil). If not, it makes sense to me to file one.
Using OUTER with $a correctly causes it to refer to the outer scope
The symbol associated with the $a variable in the outer block is stored in the stash associated with the outer block. This is what's referenced by OUTER.
Is it just not possible to use OUTER with &?BLOCK?
I reckon not for the reasons given above. Let's see if anyone corrects me.
If not, is there a way to access the outer block from the inner block?
You could pass it as an argument. In other words, close the inner block with }(&?BLOCK); instead of just }. Then you'd have it available as $_ in the inner block.
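For example, a minimal sketch of that (untested; the IDs printed for "passed-in" and "outer" should match, while "inner" should differ):
{
    say 'outer:     ', &?BLOCK.WHICH;
    {
        say 'passed-in: ', $_.WHICH;      # the outer block, received as $_
        say 'inner:     ', &?BLOCK.WHICH; # this inner block
    }(&?BLOCK);
}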
Why does the &?BLOCK that precedes the postfix for also refer to its own block?
It is surprising until you know why, but...
Is a block implicitly created around the body of the for statement?
Seems so, so the body can take an argument passed by each iteration of the for.
My understanding is that the postfix forms were useful in large part because they do not require blocks.
I've always thought of their benefit as being that they A) avoid a separate lexical scope and B) avoid having to type in the braces.
Is that incorrect?
It seems so. for has to be able to supply a distinct $_ to its statement(s) (you can put a series of statements in parens), so if you don't explicitly write braces, it still has to create a distinct lexical frame, and presumably it was considered better that the &?BLOCK variable tracked that distinct frame with its own $_, and "pretended" that was a "block", and displayed its gist with a {...}, despite there being no explicit {...}.
Why do some of the blocks have OUTER::<$_> in them but others do not?
While for (and given etc) always passes an "it" aka $_ argument to its blocks/statements, other blocks do not have an argument automatically passed to them, but they will accept one if the writer of the code passes one manually.
To support this wonderful idiom in which one can either pass or not pass an argument, blocks other than ones that are automatically fed an $_ are given this default of binding $_ to the outer block's $_.
I'm especially confused by Block 2, which is not the outermost block.
I'm confused by you being especially confused by that. :) If the foregoing hasn't sufficiently cleared this last aspect up for you, please comment on what it is about this last bit that's especially confusing.
During compilation the compiler has to keep track of various things. One of which is the current block that it is compiling.
The block object gets stored in the compiled code wherever it sees the special variable $?BLOCK.
Basically the compile-time variables aren't really variables, but more of a macro.
So whenever it sees $?BLOCK, the compiler replaces it with whatever block it is currently compiling.
It just happens that $?OUTER::BLOCK is somehow close enough to $?BLOCK that it replaces that too.
I can show you that there really isn't a variable by that name by trying to look it up by name.
{ say ::('&?BLOCK') } # ERROR: No such symbol '&?BLOCK'
Also every pair of {} (that isn't a hash ref or hash index) denotes a new block.
So each of these lines will say something different:
{
say $?BLOCK.WHICH;
say "{ $?BLOCK.WHICH }";
if True { say $?BLOCK.WHICH }
}
That means if you declare a variable inside one of those constructs it is contained to that construct.
"{ my $a = "abc"; say $a }"; # abc
say $a; # COMPILE ERROR: Variable '$a' is not declared
if True { my $b = "def"; say $b } # def
say $b; # COMPILE ERROR: Variable '$b' is not declared
In the case of postfix for, the left side needs to be a lambda/closure so that for can set $_ to the current value.
It was probably just easier to fake it up to be a Block than to create a new Code type just for that use.
Especially since an entire Raku source file is also considered a Block.
A bare Block can have an optional argument.
my &foo;
given 5 {
&foo = { say $_ }
}
foo( ); # 5
foo(42); # 42
If you give it an argument it sets $_ to that value.
If you don't, $_ will point to whatever $_ was outside of that declaration. (Closure)
For many of the uses of that construct, doing that can be very handy.
sub call-it-a (&c){
c()
}
sub call-it-b (&c, $arg){
c( $arg * 10 )
}
for ^5 {
call-it-a( { say $_ } ); # 0␤ 1␤ 2␤ 3␤ 4␤
call-it-b( { say $_ }, $_ ); # 0␤10␤20␤30␤40␤
}
For call-it-a we needed it to be a closure over $_ to work.
For call-it-b we needed it to be an argument instead.
By having :( ;; $_? is raw = OUTER::<$_> ) as the signature it caters to both use-cases.
This makes it easy to create simple lambdas that just do what you want them to do.

Fortran CHARACTER FUNCTION without defined size [duplicate]

I am writing the following simple routine:
program scratch
character*4 :: word
word = 'hell'
print *, concat(word)
end program scratch
function concat(x)
character*(*) x
concat = x // 'plus stuff'
end function concat
The program should be taking the string 'hell' and concatenating to it the string 'plus stuff'. I would like the function to be able to take in any length string (I am planning to use the word 'heaven' as well) and concatenate to it the string 'plus stuff'.
Currently, when I run this on Visual Studio 2012 I get the following error:
Error 1 error #6303: The assignment operation or the binary
expression operation is invalid for the data types of the two
operands. D:\aboufira\Desktop\TEMP\Visual
Studio\test\logicalfunction\scratch.f90 9
This error is for the following line:
concat = x // 'plus stuff'
It is not apparent to me why the two operands are not compatible. I have set them both to be strings. Why will they not concatenate?
High Performance Mark's comment tells you about why the compiler complains: implicit typing.
The result of the function concat is implicitly typed because you haven't declared its type otherwise. Although x // 'plus stuff' is the correct way to concatenate character variables, you're attempting to assign that new character object to an (implicitly) real function result.
Which leads to the question: "just how do I declare the function result to be a character?". Answer: much as you would any other character variable:
character(len=length) concat
[note that I use character(len=...) rather than character*.... I'll come on to exactly why later, but I'll also point out that the form character*4 is obsolete according to current Fortran, and may eventually be deleted entirely.]
The tricky part is: what is the length it should be declared as?
When declaring the length of a character function result which we don't know ahead of time there are two1 approaches:
an automatic character object;
a deferred length character object.
In the case of this function, we know that the length of the result is 10 longer than the input. We can declare
character(len=LEN(x)+10) concat
To do this we cannot use the form character*(LEN(x)+10).
In a more general case, deferred length:
character(len=:), allocatable :: concat ! Deferred length, will be defined on allocation
where later
concat = x//'plus stuff' ! Using automatic allocation on intrinsic assignment
Using these forms adds the requirement that the function concat has an explicit interface in the main program. You'll find much about that in other questions and resources. Providing an explicit interface will also remove the problem that, in the main program, concat also implicitly has a real result.
To stress:
program
implicit none
character(len=[something]) concat
print *, concat('hell')
end program
will not work for concat having result of the "length unknown at compile time" forms. Ideally the function will be an internal one, or one accessed from a module.
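For example, a minimal sketch of the module route using the deferred-length form (the module name and the extra 'heaven' call are just illustrative):
module concat_mod
  implicit none
contains
  function concat(x)
    character(len=*), intent(in) :: x
    character(len=:), allocatable :: concat
    concat = x // 'plus stuff'
  end function concat
end module concat_mod

program scratch
  use concat_mod
  implicit none
  print *, concat('hell')
  print *, concat('heaven')
end program scratch
Because the function is accessed from a module, its interface is explicit in the main program, so both the character result type and its length are known to the caller.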
1 There is a third: assumed length function result. Anyone who wants to know about this could read this separate question. Everyone else should pretend this doesn't exist. Just like the writers of the Fortran standard.

scope of julia variables: re-assigning within a loop in open expression

I am struggling with re-assigning a variable in a loop in Julia. I have a following example:
infile = "test.txt"
feature = "----"
for ln in 1:3
println(feature)
feature = "+"
end
open(infile) do f
#if true
# println(feature)
# feature = "----"
println(feature)
for ln in 1:5 #eachline(f)
println("feature")
#fails here
println(feature)
# because of this line:
feature = "+"
end
end
It fails if I do the reassignment within the loop. I see that the issue is with variable scope, and that nested scopes are involved. The reference says that loops introduce 'soft' scope. I cannot find out from the manual what scope the open expression belongs to, but it seemingly screws things up: if I replace open with if true, things run smoothly.
Do I understand correctly that open introduces 'hard' scope and that it is the reason why re-assignment retroactively makes the variable undefined?
You should think of
open("file") do f
...
end
as
open(function (f)
...
end, "file")
that is, do introduces the same kind of hard scope as a function or -> would.
So to be able to write to feature from the function, you need to do
open(infile) do f
    global feature # this means: use the global `feature`
    println(feature)
    for ln in 1:5
        println("feature")
        println(feature)
        feature = "+"
    end
end
Note that this is only needed at top-level (module/global) scope; inside a function, the do block's closure can read and assign the enclosing function's local variables directly, without any global declaration.
(The for loop in this case is a red herring; regardless of the soft scope of the loop, the access to feature will be limited by the hard scope of the anonymous function introduced by do.)
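For example, a minimal sketch of the same pattern inside a function (the function and file names are hypothetical), where no global declaration is needed:
function print_features(infile)
    feature = "----"
    open(infile) do f
        for ln in eachline(f)
            println(feature)
            feature = "+"   # rebinds the enclosing function's local; no `global` required
        end
    end
end

print_features("test.txt")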

Megaparsec: macro expansion during parsing

In a small DSL, I'm parsing macro definitions, similar to C pre-processor #define directives (here is a simplistic example):
_def mymacro(a,b) = a + b / a
When the following call is encountered by the parser
c = mymacro(pow(10,2),3)
it is expanded to
c = pow(10,2) + 3 / pow(10,2)
My current approach is:
wrap the parser in a State monad
when parsing macro definitions, store them in the state, with their body unparsed (parse it as a string)
when parsing a macro call, find the definition in the state, replace the arguments in the body text, replace the call with this body and resume the parsing.
Some code from the last step:
macrocallStmt
  = do -- capture starting position and content of old input before macro call
       oldInput <- getInput
       oldPos <- getPosition
       -- parse the call
       ret <- identifier
       symbolCS "="
       i <- identifier
       args <- parens $ commaSep anyExprStr
       -- expand the macro call
       us <- get
       let inlinedCall = replaceMacroArgs i args ret us
       -- set up new input with macro call expanded
       remainder <- getInput
       let newInput = T.append inlinedCall (T.cons '\n' remainder)
       setPosition oldPos
       setInput newInput
       -- update the expanded input script
       modify (updateExpandedInput oldInput newInput)
anyExprStr = fmap praShow expression <|> fmap praShow algexpr
This approach does the job decently. However, it has a number of drawbacks.
Parsing multiple times
Any valid DSL expression can be an argument of the macro call. Therefore, even though I only need their textual representation (to be replaced in the macro body), I need to parse them and then convert them again to string - simply looking for the next comma wouldn't work. Then the complete and customised macro will be parsed. So in practice, macro arguments get parsed twice (and also show-ed, which has its cost). Moreover, each call requires a new parsing of the (almost same) body. The reason to keep the body unparsed in memory is to allow maximum flexibility: in the body, even DSL keywords could be constructed out of the macro arguments.
Error handling
Because the expanded body is inserted in front of the unconsumed input (replacing the call), the initial and final input can be quite different. In the event of a parse error, the position where the error occurred in the expanded input is available. However, when processing the error, I only have the original, not expanded, input. So the error position won't match.
That is why, in the code snippet above, I use the state to save the expanded input, so that it is available when the parser exits with an error.
This works well, but I noticed that it becomes quite costly, with new Text arrays (the input stream is Text) being allocated for the whole stream at every expansion. Perhaps keeping the expanded input in the state as String, rather than Text, would be cheaper in this case, i.e. when a middle part needs to be replaced?
The reasons for this question are:
I would appreciate suggestions / comments on the two issues described above
Can anyone suggest a better approach altogether?

Groovy DSL: creating dynamic closures from Strings

There are some other questions on here that are similar but sufficiently different that I need to pose this as a fresh question:
I have created an empty class, let's call it Test. It doesn't have any properties or methods. I then iterate through a map of key/value pairs, dynamically creating properties named for the key and containing the value... like so:
def langMap = [:]
langMap.put("Zero",0)
langMap.put("One",1)
langMap.put("Two",2)
langMap.put("Three",3)
langMap.put("Four",4)
langMap.put("Five",5)
langMap.put("Six",6)
langMap.put("Seven",7)
langMap.put("Eight",8)
langMap.put("Nine",9)
langMap.each { key,val ->
Test.metaClass."${key}" = val
}
Now I can access these from a new method created like this:
Test.metaClass.twoPlusThree = { return Two + Three }
def test = new Test()
println test.twoPlusThree()
What I would like to do though, is dynamically load a set of instructions from a String, like "Two + Three", create a method on the fly to evaluate the result, and then iteratively repeat this process for however many strings containing expressions that I happen to have.
Questions:
a) First off, is there simply a better and more elegant way to do this (Based on the info I have given) ?
b) Assuming this path is viable, what is the syntax to dynamically construct this closure from a string, where the string references variable names valid only within a method on this class?
Thanks!
I think the correct answer depends on what you're actually trying to do. Can the input string be a more complicated expression, like '(Two + Six) / Four'?
If you want to allow more complex expressions, you may want to directly evaluate the string as a Groovy expression. Inside the GroovyConsole or a Groovy script, you can directly call evaluate, which will evaluate an expression in the context of that script:
def numNames = 'Zero One Two Three Four Five Six Seven Eight Nine'.split()
// Add each number name as a property to the script.
numNames.eachWithIndex { name, i ->
this[name] = i
}
println evaluate('(Two + Six) / Four') // -> 2
If you are not in one of those script-friendly worlds, you can use the GroovyShell class:
def numNames = 'Zero One Two Three Four Five Six Seven Eight Nine'.split()
def langMap = [:]
numNames.eachWithIndex { name, i -> langMap[name] = i }
def shell = new GroovyShell(langMap as Binding)
println shell.evaluate('(Two + Six) / Four') // -> 2
But, be aware that using eval is very risky. If the input string is user-generated, I would not recommend going this way; the user could input something like "rm -rf /".execute(), and, depending on the privileges of the script, erase everything from wherever that script is executed. You may first validate that the input string is "safe" (maybe checking it only contains known operators, whitespace, parentheses and number names) but I don't know if that's safe enough.
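For example, one hypothetical whitelist check along those lines (a sketch only, not a guarantee of safety; isSafe is a made-up helper):
def numNames = 'Zero One Two Three Four Five Six Seven Eight Nine'.split()
def allowed = numNames.join('|')
// Accept only number names, digits, basic arithmetic operators, parentheses and spaces
def isSafe = { String expr ->
    expr.matches(/($allowed|\d+|[+\-*\/() ])+/)
}
assert isSafe('(Two + Six) / Four')
assert !isSafe('"rm -rf /".execute()')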
Another alternative is defining your own mini-language for those expressions and then parsing them using something like ANTLR. But, again, this really depends on what you're trying to accomplish.

Resources