I'm messing around with Lua trying to create my own "scripting language".
It's actually just a string that is translated to Lua code, then executed through the use of loadstring. I'm having a problem with my string patterns. When you branch (for example, defining a variable inside of a variable declaration) it errors. For example, the following code would error:
local code = [[
define x as private: function()
define y as private: 5;
end;
]]
--defining y inside of another variable declaration, causes error
This is happening because the pattern to declare a variable first looks for the keyword 'define', and captures everything until a semicolon is found. Therefore, x would be defined as:
function()
define y as private: 5 --found a semicolon, set x to capture
I guess my question is, is it possible to ignore semicolons until the correct one is reached? Here is my code so far:
local lang = {
["define(.-)as(.-):(.-);"] = function(m1, m2, m3)
return (
m2 == "private" and " local " .. m1 .. " = " .. m3 .. " " or
m2 == "global" and " " .. m1 .. " = " .. m3 .. " " or
"ERROR IN DEFINING " .. m1
)
end,
}
function translate(code)
for pattern, replace in pairs(lang) do
code = code:gsub(pattern, replace)
end
return code
end
local code = [[
define y as private: function()
define x as private: 10;
end;
]]
loadstring(translate(code:gsub("%s*", "")))()
--remove the spaces from code, translate it to Lua code through the 'translate' function, then execute it with loadstring
The easiest solution is to change your last capture group from
(.-) -- 0 or more lazy repetitions
to
(.*) -- 0 or more repetitions
i.e.
pattern = 'define(.-)as(.-):(.*);'
According to PiL, the - modifier matches the shortest possible sequence, whereas * matches the longest.
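To make the difference concrete, here is a small sketch in Python rather than Lua (an assumption on my part: Python's `.*?` plays the role of Lua's `.-`, and `.*` is greedy in both). The input is the question's snippet with whitespace already stripped.

```python
import re

# Whitespace-stripped source, as produced by the question's gsub("%s*", "").
code = "defineyasprivate:function()definexasprivate:10;end;"

# Lazy last group: stops at the FIRST semicolon.
lazy = re.search(r"define(.*?)as(.*?):(.*?);", code)
# Greedy last group: runs to the LAST semicolon.
greedy = re.search(r"define(.*?)as(.*?):(.*);", code)

print(lazy.group(3))    # function()definexasprivate:10
print(greedy.group(3))  # function()definexasprivate:10;end
```

Note that the greedy version runs to the last semicolon in the whole input, so two top-level definitions in a row would be swallowed into a single match.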
However, as noted in my comment, I wouldn't advise writing a parser for your language using pattern matching. It will require really complicated patterns (to handle edge cases) and will probably be unclear to others.
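As a rough idea of what a non-pattern approach could look like, here is a minimal sketch in Python that finds the semicolon closing a statement by tracking nesting depth. The name `statement_end` is hypothetical, and it assumes `function ... end` is the only nesting construct; other Lua blocks (`if`, `do`, `while`) would need to be added as openers.

```python
import re

# The only tokens that matter here: block opener, block closer, semicolon.
TOKEN = re.compile(r"\bfunction\b|\bend\b|;")

def statement_end(code, start=0):
    """Return the index of the ';' that closes the statement starting at
    `start`, ignoring semicolons nested inside function ... end blocks."""
    depth = 0
    for m in TOKEN.finditer(code, start):
        tok = m.group()
        if tok == "function":
            depth += 1          # entering a nested block
        elif tok == "end":
            depth -= 1          # leaving a nested block
        elif depth == 0:
            return m.start()    # a ';' at top level: this is the one we want
    raise ValueError("no terminating semicolon found")

code = "define x as private: function()\n  define y as private: 5;\nend;\n"
end = statement_end(code)
print(code[:end])  # the whole definition of x, inner semicolon included
```

The translator could then recurse on the captured body to translate the inner `define` before emitting the outer one.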
Often code is not as readable as it could be because parameters are always at the end of the function name. Ex.: addDaysToDate(5, myDate).
I thought about a more readable syntax like this:
function add(days)DaysTo(date) {
// Some implementation
}
var myDate = new Date()
add(5)DaysTo(myDate)
And you could go really crazy:
addA(5)('dollar')CouponTo(order)If(user)IsLoggedIn
So here is my question: Are there any languages that incorporate this concept?
Assuming a generous interpretation of the phrase "is there", then: Algol 60 could look like your example. Specifically, it allowed a form of comment in procedure parameters.
add(5) Days To: (myDate);
The specific rule in the grammar that permits this is:
<parameter delimiter> ::= , | ) <letter string> : (
which is to say, the parameters in a procedure statement can be separated by a comma (as is common) or by an arbitrary sequence of letters delimited by ) and :(.
Spaces are everywhere ignored, so they're ok here too.
The letter-string is treated as a comment, so as for all comments, it has no bearing on what the code actually does. This is just as valid as the previous example:
add(5) Bananas To: (myDate);
It seems curious to me now, nearly 45 years after I last used this, that the comment part can only contain letters, no digits.
<letter string> ::= <letter> | <letter string> <letter>
Revised Report on the Algorithmic Language ALGOL 60
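To illustrate how that rule reads in practice, here is a small Python sketch (not an Algol tool; `DELIM` and `normalize` are names made up for this example) that rewrites the `) <letter string> : (` form of the delimiter into a plain comma, showing that the letters carry no meaning.

```python
import re

# ")" + one or more letters + ":" + "(", with spaces ignored between tokens.
# Digits are deliberately not allowed, matching the <letter string> rule.
DELIM = re.compile(r"\)\s*(?:[A-Za-z]\s*)+:\s*\(")

def normalize(call):
    """Replace every comment-style parameter delimiter with a comma."""
    return DELIM.sub(", ", call)

print(normalize("add(5) Days To: (myDate);"))     # add(5, myDate);
print(normalize("add(5) Bananas To: (myDate);"))  # add(5, myDate);
```

Both calls normalize to the same text, which is the point made above: the letter string is only a comment.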
Have a look at Pogoscript https://github.com/featurist/pogoscript
There are no keywords in PogoScript. All control structures use the same syntax rules as regular functions and methods, so it's very easy to write your own control structures
Arguments and parameters can be placed anywhere in the name of a function or method call. The careful placement of an argument or a parameter can give it a lot of meaning.
sing (n) bottlesOfBeerOnTheWall =
if (n > 0)
console.log ((n) bottlesOfBeerOnTheWall)
sing (n - 1) bottlesOfBeerOnTheWall
(n) bottlesOfBeerOnTheWall =
"#((n) bottles) of beer on the wall, #((n) bottles) of beer.\n" +
"Take one down, pass it around, #((n - 1) bottles) of beer on the wall."
(n) bottles =
if (n == 0)
"no bottles"
else if (n == 1)
"1 bottle"
else
"#(n) bottles"
sing 99 bottlesOfBeerOnTheWall
In the ANTLR4 Java grammar (https://github.com/antlr/grammars-v4/blob/master/java/Java.g4) I would like to know when I have the complete expression. In this example I am trying to make a transformation similar to the following:
from: String foo = bar + ", " + baz + "; are true";
to: String foo = String.format("{0}, {1}; are true", bar, baz);
The trouble begins in the declaration from the grammar:
expression ('+'|'-') expression
which is a child of expression as well. Given the example above, the callbacks will look something like the following:
0: exp0:bar, exp1:","
1: exp0:bar ",", exp1:baz
2: exp0:bar "," baz, exp1:"; are true"
I am targeting the line using a #alias btw. So what I am awkwardly saying is - how do I use a listener to be able to grab the entire expression when a rule is expressed recursively in order to transform the entire expression? Or is there another way that I haven't seen yet?
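One possible direction, sketched in Python rather than against the ANTLR API: act only on the outermost additive expression (the one whose parent is not itself an additive expression) and flatten the left-recursive tree into a list of operands. The node classes `Lit`, `Ref` and `Add` and the helpers `operands` and `to_format_call` are hypothetical stand-ins for parse-tree contexts, not ANTLR types.

```python
from dataclasses import dataclass

@dataclass
class Lit:          # a string literal operand
    text: str

@dataclass
class Ref:          # an identifier operand
    name: str

@dataclass
class Add:          # expression '+' expression
    left: object
    right: object

def operands(e):
    """Flatten a nested chain of Add nodes into its operands, left to right."""
    if isinstance(e, Add):
        return operands(e.left) + operands(e.right)
    return [e]

def to_format_call(e):
    """Build the String.format(...) text from the flattened operands."""
    fmt, args = [], []
    for op in operands(e):
        if isinstance(op, Lit):
            fmt.append(op.text)
        else:
            fmt.append("{%d}" % len(args))   # placeholder for the next argument
            args.append(op.name)
    return 'String.format("%s", %s)' % ("".join(fmt), ", ".join(args))

# bar + ", " + baz + "; are true", parsed left-recursively
tree = Add(Add(Add(Ref("bar"), Lit(", ")), Ref("baz")), Lit("; are true"))
print(to_format_call(tree))
# String.format("{0}, {1}; are true", bar, baz)
```

In a listener, the equivalent would be to skip the callback whenever the parent context is the same additive alternative, so the rewrite runs once per chain.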
I've been trying to run this function :
let insert_char s c =
let z = String.create(String.length(s)*2 -1) in
for i = 0 to String.length(s) - 1 do
z.[2*i] <- s.[i];
z.[2*i+1] <- c;
done;
z;;
print_string(insert_char("hello", 'x'));;
However the interpreter returns a type error at the last line "type is string * char" and it expected it to be string. I thought my function insert_char created a string. I don't really understand, thanks.
You define your function as a curried function, but you're calling it with a pair of values. You should call it like this:
insert_char "hello" 'x'
OCaml doesn't require parentheses for a function call. When two values are placed next to each other with nothing between, this is a function call.
The syntax of function application in OCaml is f arg1 arg2, not f(arg1, arg2). So it would be print_string (insert_char "hello" 'x').
(Once you fix that you'll discover that your code has other problems, but that is unrelated to your question.)
I have a macro that looks essentially like this:
#macro( surround $x )
surround:$x
$bodyContent
/surround:$x
#end
Invocation #@surround("A")bunch o' stuff#end produces "surround:A bunch o' stuff /surround:A" as
expected. Invocation #@surround("A")#@surround("B")more stuff#end#end produces
surround:A surround:B more stuff /surround:B /surround:A which is exactly what I want.
But now I want to build upwards with another macro
#macro( annotated-surround $x $y )
#@surround( $x )
annotate:$y
$bodyContent
#end
#end
The intended expansion of #@annotated-surround( "C" "note" ) stuff #end is
surround:C annotate:note stuff /surround:C
...but this doesn't work; I get the dreaded semi-infinite expansion of the annotated-surround body
content.
I have read the answer at Closure in Velocity template macros and still don't quite know whether what I want to do is possible.
I'm willing to do arbitrarily tricky things within the definitions of #surround and
#annotated-surround, but I don't want the users of those macros to see any complexity. The
whole idea is to simplify their lives.
As long as I have your ear: Setting macro.provide.scope.control=true is supposed to provide "a local namespace in macros". What does this mean? Is the provided namespace independent of the default context, but with a single such space shared among all invocations of all macros? Or is a separate context provided for each macro invocation, even recursively? It has to be the latter because of $macro.parent, right?
And yet another question. Consider the following macro:
#macro( recursive $x )
#if($x == 0)
zero
#else
$x before . . .
#set($xMinusOne = $x - 1)
#recursive($xMinusOne)
. . . $x after
#end
#end
#recursive( 4 ) yields:
4 before . . .
3 before . . .
2 before . . .
1 before . . .
zero . . .
0 after . . .
0 after . . .
0 after . . .
4 after
Now I understand all those occurrences of "0": there's only one global $x, so assigning to it on
the recursive calls smashes it and it doesn't get restored. But where on earth does that final "4"
come from? For that matter, how is it that my first "surround" macro works to arbitrary depth;
how come its final $x doesn't get smashed in inner calls?
Sorry to be so prolix, but I have been unable to find clear documentation in this matter.
The problem is the combination of global variables, a name collision, and lazy rendering.
Let's walk through the rendering process for #@annotated-surround( "x" "y" )content#end:
Rendering enters the annotated-surround macro. The context map contains:
$x = String x
$y = String y
$bodyContent = Renderable content - note that the String output of this has not yet been evaluated.
Rendering of the first line enters the surround macro. This updates the context map to:
new $x = old $x = String x
$y = String y
$bodyContent = Renderable annotate:$y\n$bodyContent - note that the String output of this still has not yet been evaluated, it's still template code.
Rendering outputs the first line of surround, producing the String surround:x.
Rendering begins evaluating the second line of surround, which references $bodyContent.
Rendering the first line of $bodyContent produces the String annotate:y.
Rendering begins evaluating the second line of $bodyContent, which references $bodyContent.
Rendering the first line of $bodyContent produces the String annotate:y.
Rendering begins evaluating the second line of $bodyContent, which references $bodyContent.
etc.
The solution is to remove part of the problem's combination. Global variables and lazy rendering are fundamental parts of how Velocity works, so you can't touch those. That leaves the name collision. What you need is for each macro's $bodyContent to be referred to with a different name. This is easily achieved by assigning it to new variables with unique names in each macro before invoking any other macros, and using the new variable in any invoked macro's body, like this:
#macro( surround $x )
surround:$x
$bodyContent
/surround:$x
#end
#macro( annotated-surround $x $y )
#set( $annotated-surround-content = $bodyContent )
#@surround( $x )
annotate:$y
$annotated-surround-content
#end
#end
Rendering of this version goes like this:
Rendering enters the annotated-surround macro. The context map contains:
$x = String x
$y = String y
$bodyContent = Renderable content - note that the String output of this has not yet been evaluated.
Rendering of the first line executes the #set directive, adding a variable to the context map: $annotated-surround-content = current $bodyContent = Renderable content.
Rendering of the second line enters the surround macro. This updates the context map to:
new $x = old $x = String x
$y = String y
$annotated-surround-content = old $bodyContent = Renderable content
$bodyContent = Renderable annotate:$y\n$annotated-surround-content
Rendering outputs the first line of surround, producing the String surround:x.
Rendering begins evaluating the second line of surround, which references $bodyContent.
Rendering the first line of $bodyContent produces the String annotate:y.
Rendering begins evaluating the second line of $bodyContent, which references $annotated-surround-content.
Rendering $annotated-surround-content produces the String content.
Rendering outputs the third line of surround, producing the String /surround:x.
The final rendered output is surround:x annotate:y content /surround:x. This approach can be generalized by applying such substitutions to all occurrences of $bodyContent that are inside the content of another macro call, each time using a variable name derived from the macro's name to ensure uniqueness. It won't work for recursive macros without something extra to distinguish each nested invocation, however.
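The two walk-throughs above can be mimicked with a toy model in Python. This is only a sketch of the idea (one shared map plus lazily resolved references), not how Velocity is implemented; `render`, `surround` and the two `annotated_surround_*` functions are invented for the illustration, and `$x` and `$y` are substituted eagerly to keep it short.

```python
def render(template, ctx, depth=0):
    """Render a list of tokens; tokens starting with '$' are looked up in
    ctx only at render time (lazy), everything else is literal text."""
    if depth > 10:  # stand-in for the "semi-infinite expansion"
        raise RecursionError("template keeps referring to itself")
    parts = []
    for token in template:
        if token.startswith("$"):
            parts.append(render(ctx[token[1:]], ctx, depth + 1))
        else:
            parts.append(token)
    return " ".join(parts)

def surround(ctx, x, body):
    ctx["bodyContent"] = body  # one shared map: overwrites the caller's entry
    return render(["surround:" + x, "$bodyContent", "/surround:" + x], ctx)

def annotated_surround_broken(ctx, x, y, body):
    ctx["bodyContent"] = body
    # The body handed to surround refers to $bodyContent, i.e. to itself.
    return surround(ctx, x, ["annotate:" + y, "$bodyContent"])

def annotated_surround_fixed(ctx, x, y, body):
    ctx["bodyContent"] = body
    ctx["annotatedSurroundContent"] = ctx["bodyContent"]  # the #set step
    return surround(ctx, x, ["annotate:" + y, "$annotatedSurroundContent"])

print(annotated_surround_fixed({}, "x", "y", ["content"]))
# surround:x annotate:y content /surround:x
try:
    annotated_surround_broken({}, "x", "y", ["content"])
except RecursionError as err:
    print("broken version:", err)
```

The fixed variant works for the same reason as in the template: the saved copy is stored under a name that the inner macro never overwrites.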
Regarding the scope setting, all that does is add a $macro object to the context, which is unique to each macro invocation and can be used as a map. If you set $macro.myVar to something different in each of two nested macro calls, the outer macro's value for it will be unchanged when the inner one finishes. This does not help with the $bodyContent issue, however, because any reference to $macro inside a macro's $bodyContent will be resolved to the innermost macro when it's rendered.
Regarding the final 4 from #recursive( 4 ), that comes from a combination of macro arguments having local scope and being passed by name. For all but the outermost invocation of #recursive, the argument $x is a reference to the global context variable $xMinusOne - when they render the after line, the use of $x is actually resolved to looking up the current value of $xMinusOne in the global context. For the outermost invocation it is instead the constant value 4, and the arguments of the inner invocations go out of scope when they finish, so when the outermost one gets to the final line it's back to being 4.
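The pass-by-name explanation can be checked with a small Python model. Again this is a sketch under assumptions, not Velocity internals: each argument is modeled as a zero-argument callable that is re-evaluated on every use, and `ctx` stands in for the global context.

```python
def recursive(ctx, x, out):
    """x is a zero-argument callable, re-evaluated on each use (pass by name)."""
    if x() == 0:
        out.append("zero")
        return
    out.append(f"{x()} before")
    ctx["xMinusOne"] = x() - 1                     # the #set: writes the shared global
    recursive(ctx, lambda: ctx["xMinusOne"], out)  # argument is a reference, not a copy
    out.append(f"{x()} after")                     # looked up again at this point

ctx, out = {}, []
recursive(ctx, lambda: 4, out)   # outermost argument is the constant 4
print("\n".join(out))
```

The inner invocations all read the same global, which is 0 by the time they print their "after" line, while the outermost invocation still sees its constant 4.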
Starting with the easiest, macro.provide.scope.control=true will definitely create a separate $macro scope object for every macro invocation. Otherwise, as you note, the $macro.parent would be nonsense. The whole point of the "scope controls" is to provide an explicit namespace for the type of VTL block in question. You can even do surround.provide.scope.control=true to automatically create a $surround scope inside of #@surround bodyContent.
On your first question, I'm a little confused as to what's happening. Both the call to #@annotated-surround and the nested call to #@surround will make $bodyContent references available. Am I right that what's happening is that the "wrong" $bodyContent is being used? The $bodyContent reference should belong to the nearest block macro call. To reference the outer macro's $bodyContent within the inner macro, you'll probably need to #set( $macro.bodyContent = $bodyContent ) and then, within the inner, use it via $macro.parent.bodyContent
As for the #recursive weirdness, I don't know offhand and have to get on to other work now. It also doesn't help that I don't have Velocity checked out on my present machine, so I can't quickly try things out.
I was wondering. Are there languages that use only pass-by-reference as their eval strategy?
I don't know what an "eval strategy" is, but Perl subroutine calls are pass-by-reference only.
sub change {
$_[0] = 10;
}
$x = 5;
change($x);
print $x; # prints "10"
change(0); # raises "Modification of a read-only value attempted" error
VB (pre-.NET), VBA and VBS default to ByRef, although it can be overridden when calling or defining the sub or function.
FORTRAN does. Since the language predates terms such as pass-by-reference, one should probably say that it uses pass-by-address. A FORTRAN function like:
INTEGER FUNCTION MULTIPLY_TWO_INTS(A, B)
INTEGER A, B
MULTIPLY_TWO_INTS = A * B
RETURN
END
will have a C-style prototype of:
extern int MULTIPLY_TWO_INTS(int *A, int *B);
and you could call it via something like:
int result, a = 1, b = 100;
result = MULTIPLY_TWO_INTS(&a, &b);
Another example is languages that have no function arguments as such but use a stack instead. Forth and its derivatives work this way: a function can change the variable space (the stack) in whichever way it wants, modifying existing elements as well as adding or removing them. "Prototype comments" in Forth usually look something like
(argument list -- return value list)
which means the function takes a certain, not necessarily constant, number of arguments and returns a likewise not necessarily constant number of elements. For example, you can have a function that takes a number N as its argument and returns N elements, preallocating an array if you like.
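For readers who don't know Forth, the stack discipline can be imitated in Python with a plain list. This is only an analogy; `allot_zeros` is a made-up word for the example, not a standard Forth word.

```python
def allot_zeros(stack):
    """( n -- 0 ... 0 )  Pop a count n, then push n zeros.
    The function has no parameters of its own: everything it consumes
    and produces lives on the shared stack it is handed."""
    n = stack.pop()
    stack.extend([0] * n)

stack = [7, 3]       # 3 is on top
allot_zeros(stack)   # consumes the 3, leaves three zeros above the 7
print(stack)         # [7, 0, 0, 0]
```

The number of results depends on a run-time value, which is what the "not necessarily constant" remark above refers to.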
How about Brainfuck?