How does Perl 6's multi dispatch decide which routine to use? - signature

Consider this program where I construct an Array in the argument list. Although there's a signature that accepts an Array, this calls the one that accepts a List:
foo( [ 1, 2, 3 ] );
multi foo ( Array @array ) { put "Called Array @ version" }
multi foo ( Array $array ) { put "Called Array \$ version" }
multi foo ( List $list ) { put "Called List version" }
multi foo ( Range $range ) { put "Called Range version" }
I get the output from an unexpected routine:
Called Array $ version
If I comment out that other signature, the List version is called instead:
Called List version
Why doesn't it call the ( Array @array ) version? How is the dispatcher making its decision (and where is it documented)?

Why doesn't it call the ( Array @array ) version?
Your test foo call has just an Array ([1,2,3]) as its argument, not an Array of Arrays (eg [[1,2,3],[4,5,6]]).
(The @ in @array indicates a value that does Positional, eg an array or a list. Array @array indicates the same thing but with the additional constraint that each element of the array, list or whatever is an Array.)
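The element constraint can be checked directly with a smartmatch. This is a minimal sketch, not part of the original answer; the variable names @plain and @typed are made up for illustration. It relies on a parameter written as Array @array being checked against Positional[Array]:

```raku
# Hypothetical variable names; only the type checks matter here.
my       @plain = [1], [2], [3];   # untyped Array whose elements happen to be Arrays
my Array @typed = [1], [2], [3];   # Array declared with an element type of Array

say @plain ~~ Positional[Array];   # False
say @typed ~~ Positional[Array];   # True
say @typed.of;                     # (Array)
```

Only the declared element type counts, not what the elements happen to be at run time, which is why an ordinary literal such as [ [1], [2], [3] ] does not satisfy the constraint.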
How is the dispatcher making its decision?
Simplifying, it's picking the narrowest matching type:
multi foo ( Array ) {} # Narrowest
multi foo ( List ) {} # Broader
multi foo ( Positional ) {} # Broader still
multi foo ( @array ) {} # Same as `Positional`
(Diagram of subtype relationships of Array, List and Positional.)
For lots of details see jnthn's authoritative answer to a related SO question.
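As a rough illustration of "narrowest matching type", here is a small sketch with a made-up multi named bar. Each call lands on the most specific candidate that accepts its argument:

```raku
# Hypothetical candidates, ordered from narrowest to broadest type.
multi bar ( Array      $x ) { 'Array' }
multi bar ( List       $x ) { 'List' }
multi bar ( Positional $x ) { 'Positional' }

say bar( [1, 2, 3] );   # Array
say bar( (1, 2, 3) );   # List
say bar( 1 .. 3 );      # Positional
```

An Array is also a List and a Positional, so all three candidates accept [1, 2, 3]; the dispatcher picks the Array one because it is the narrowest. A Range is neither an Array nor a List, so only the Positional candidate is left for it.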
(and where is it documented)?
I'm not sure about the doc. Multi-dispatch looks pretty minimal.

I made a really dumb mistake, and that's why I wasn't seeing what I expected. You can't constrain a variable that starts with @. Any constraint applies to its elements. The Array @array denotes that I have a positional sort of thing in which each element is an Array. This is the same thing that raiph said. The odd thing is that the grammar looks the same but it does different things. It's something I've tripped over before.
Since it's doing something different, it's not going to work out even if the data structure matches:
foo( [ [1], [2], [3] ] );
foo( [ 1, 2, 3 ] );
multi foo ( Array @array ) { put "Called Array @ version" }
multi foo ( Array $array ) { put "Called Array \$ version" }
multi foo ( List $list ) { put "Called List version" }
multi foo ( Range $range ) { put "Called Range version" }
I still get the version I wouldn't expect based on the constraint and the data structure:
Called Array $ version
Called Array $ version
I think this is just going to be one of Perl 6's warts that normal users will have to learn.

There seems to be a trade off between the design documents (more complete but more outdated) and the documentation (known to be incomplete, as docs.perl6.org acknowledges, but hopefully more up-to-date). The former explains multisub resolution in Synopsis 12. Excerpt:
When you call a routine with a particular short name, if there are multiple visible long names, they are all considered candidates. They are sorted into an order according to how close the run-time types of the arguments match up with the declared types of the parameters of each candidate. The best candidate is called, unless there's a tie, in which case the tied candidates are redispatched using any additional tiebreaker strategies (see below). [...]
There are three tiebreaking modes, in increasing order of desperation:
A) inner or derived scope
B) run-time constraint processing
C) use of a candidate marked with "is default"
Tiebreaker A simply prefers candidates in an inner or more derived scope over candidates in an outer or less derived scope. For candidates in the same scope, we proceed to tiebreaker B.
In the absence of any constraints, ties in tiebreaker A immediately failover to tiebreaker C; if not resolved by C, they warn at compile time about an ambiguous dispatch. [...]
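Two of the tiebreakers quoted above can be sketched in a few lines. The names size and greet are invented for this example, and the behaviour shown is what I understand current Rakudo to do, not something taken from the Synopsis text itself:

```raku
# Tiebreaker B: a `where` clause is a run-time constraint, so the
# constrained candidate is tried before the unconstrained one.
multi size ( Int $n where * > 10 ) { 'big' }
multi size ( Int $n )              { 'any' }

say size(20);   # big
say size(5);    # any

# Tiebreaker C: two otherwise identical candidates, one marked `is default`.
multi greet ( Str $name )            { "Hello, $name" }
multi greet ( Str $name ) is default { "Welcome, $name" }

say greet('raiph');   # Welcome, raiph
```

Without the is default trait, the greet call would be reported as ambiguous.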
I don’t know enough about Perl 6 to testify as to its accuracy, but it seems to be in agreement with raiph’s answer, and covers additional ground as well.

Related

Interpolate without creating a String context in Raku?

If I have a variable my $a = True, then I get this output from the following code:
say «a list of words foo $a bar baz».raku;
# OUTPUT: ("a", "list", "of", "words", "foo", "True", "bar", "baz")
That is, even though the result is a List, the element True is stringified before being included in the list – the list contains "True", not True. Is there any way to avoid that stringification while still using interpolation?
Would there be a way to do so if $a were a class I'd defined (and thus can write the Str method for) rather than a Bool?
(I am aware that I can write the more verbose ("a", "list", "of", "words", "foo", $a, "bar", "baz") or «a list of words foo».Slip, $a, «bar baz».Slip, but I'm asking if there is a way to still use interpolation).
Interpolation is putting a thing into a string.
"a b c $thing d e f"
It does that by first turning the thing itself into a string, and concatenating the rest of the string around it.
Basically the above compiles into this code:
infix:<~>( 「a b c 」, $thing.Str, 「 d e f」 )
« a b c $thing d e f »
Is short for:
Q :double :quotewords « a b c $thing d e f »
That is use the Quoting DSL, turning on :double quote semantics (“”) and turning on :quotewords.
:quotewords is the feature which splits the string into its individual parts.
It happens only after it has been turned into a string.
Imagine that the above compiles into:
Internals::quotewords( infix:<~>( 「 a b c 」, $thing.Str, 「 d e f 」 ) )
There is another way to get what you want, other than using .Slip or prefix |.
flat «a list of words foo», $a, «bar baz»
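A short sketch of that flat approach, reusing the $a from the question (the @words name is made up here), shows that the value keeps its type because it never passes through the quoting construct:

```raku
my $a = True;

# Only the literal words go through «»; $a is passed alongside them as-is.
my @words = flat «a list of words foo», $a, «bar baz»;

say @words.elems;     # 8
say @words[5].WHAT;   # (Bool)
```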
The whole purpose of the quoting DSL is that it produces a string.
That said :words, :quotewords, and :val all change it so that it returns something other than a single string.
And the idea of them is that they alter the DSL.
So MAYBE you could convince enough people that such a change would be worth it.
That's a big maybe.
It would potentially break many existing codebases, so you would have an uphill battle to do so.
What happens here has little to do with quoting, and a lot to do with context. As @brad-gilbert has indicated, anything that gets interpolated has a ~ put in front of it, which coerces the variable to a String context.
But that yields an answer to your second question:
Would there be a way to do so if $a were a class I'd defined (and thus can write the Str method for) rather than a Bool?
Theoretically, something like this should work:
class A {
has Bool $.foo;
method Str { $.foo }
};
my $a = A.new( :foo(True) );
say «a b $a».raku
Alas, this returns «No such method 'WORDS_AUTODEREF' for invocant of type 'Bool'␤», so it probably needs a bit of work (or I might have bumped into some bug). So this is, for the time being, and for your precise example, a nanswer. As a matter of fact, only Strs have that method, so I think that for the time being, and unless you bother to create that specialized method for a class, it's difficult to do.

Understanding Raku's `&?BLOCK` compile-time variable

I really appreciate Raku's &?BLOCK variable – it lets you recurse within an unnamed block, which can be extremely powerful. For example, here's a simple, inline, and anonymous factorial function:
{ when $_ ≤ 1 { 1 };
$_ × &?BLOCK($_ - 1) }(5) # OUTPUT: «120»
However, I have some questions about it when used in more complex situations. Consider this code:
{ say "Part 1:";
my $a = 1;
print ' var one: '; dd $a;
print ' block one: '; dd &?BLOCK ;
{
my $a = 2;
print ' var two: '; dd $a;
print ' outer var: '; dd $OUTER::a;
print ' block two: '; dd &?BLOCK;
print "outer block: "; dd &?OUTER::BLOCK
}
say "\nPart 2:";
print ' block one: '; dd &?BLOCK;
print 'postfix for: '; dd &?BLOCK for (1);
print ' prefix for: '; for (1) { dd &?BLOCK }
};
which yields this output (I've shortened the block IDs):
Part 1:
var one: Int $a = 1
block one: -> ;; $_? is raw = OUTER::<$_> { #`(Block|…6696) ... }
var two: Int $a = 2
outer var: Int $a = 1
block two: -> ;; $_? is raw = OUTER::<$_> { #`(Block|…8496) ... }
outer block: -> ;; $_? is raw = OUTER::<$_> { #`(Block|…8496) ... }
Part 2:
block one: -> ;; $_? is raw = OUTER::<$_> { #`(Block|…6696) ... }
postfix for: -> ;; $_ is raw { #`(Block|…9000) ... }
prefix for: -> ;; $_ is raw { #`(Block|…9360) ... }
Here's what I don't understand about that: why does the &?OUTER::BLOCK refer (based on its ID) to block two rather than block one? Using OUTER with $a correctly causes it to refer to the outer scope, but the same thing doesn't work with &?BLOCK. Is it just not possible to use OUTER with &?BLOCK? If not, is there a way to access the outer block from the inner block? (I know that I can assign &?BLOCK to a named variable in the outer block and then access that variable in the inner block. I view that as a workaround but not a full solution because it sacrifices the ability to refer to unnamed blocks, which is where much of &?BLOCK's power comes from.)
Second, I am very confused by Part 2. I understand why the &?BLOCK that follows the prefix for refers to an inner block. But why does the &?BLOCK that precedes the postfix for also refer to its own block? Is a block implicitly created around the body of the for statement? My understanding is that the postfix forms were useful in large part because they do not require blocks. Is that incorrect?
Finally, why do some of the blocks have OUTER::<$_> in them but others do not? I'm especially confused by Block 2, which is not the outermost block.
Thanks in advance for any help you can offer! (And if any of the code behavior shown above indicates a Rakudo bug, I am happy to write it up as an issue.)
That's some pretty confusing stuff you've encountered. That said, it does all make some kind of sense...
Why does the &?OUTER::BLOCK refer (based on its ID) to block two rather than block one?
Per the doc, &?BLOCK is a "special compile variable", as is the case for all variables that have a ? as their twigil.
As such it's not a symbol that can be looked up at run-time, which is what syntax like $FOO::bar is supposed to be about afaik.
So I think the compiler ought by rights to reject use of a "compile variable" with the package lookup syntax. (Though I'm not sure. Does it make sense to do "run-time" lookups in the COMPILING package?)
There may already be a bug filed (in either of the GH repos rakudo/rakudo/issues or raku/old-issues-tracker/issues) about it being erroneous to try to do a run-time lookup of a special compile variable (the ones with a ? twigil). If not, it makes sense to me to file one.
Using OUTER with $a correctly causes it to refer to the outer scope
The symbol associated with the $a variable in the outer block is stored in the stash associated with the outer block. This is what's referenced by OUTER.
Is it just not possible to use OUTER with &?BLOCK?
I reckon not for the reasons given above. Let's see if anyone corrects me.
If not, is there a way to access the outer block from the inner block?
You could pass it as an argument. In other words, close the inner block with }(&?BLOCK); instead of just }. Then you'd have it available as $_ in the inner block.
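Here is a minimal sketch of that suggestion, using a made-up countdown block. The argument list after the inner block's closing brace is evaluated in the outer block, so the &?BLOCK written there is the outer block, and the inner block receives it as $_:

```raku
# Hypothetical example: the inner block recurses into the OUTER block via $_.
my &countdown = -> $n {
    my $result = { $n > 0 ?? $_($n - 1) !! 'done' }(&?BLOCK);
    $result
};

say countdown(3);   # done
```

The name &countdown is only there so the example can be called; the recursion itself goes through $_ and never uses that name.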
Why does the &?BLOCK that precedes the postfix for also refer to its own block?
It is surprising until you know why, but...
Is a block implicitly created around the body of the for statement?
Seems so, so the body can take an argument passed by each iteration of the for.
My understanding is that the postfix forms were useful in large part because they do not require blocks.
I've always thought of their benefit as being that they A) avoid a separate lexical scope and B) avoid having to type in the braces.
Is that incorrect?
It seems so. for has to be able to supply a distinct $_ to its statement(s) (you can put a series of statements in parens), so if you don't explicitly write braces, it still has to create a distinct lexical frame, and presumably it was considered better that the &?BLOCK variable tracked that distinct frame with its own $_, and "pretended" that was a "block", and displayed its gist with a {...}, despite there being no explicit {...}.
Why do some of the blocks have OUTER::<$_> in them but others do not?
While for (and given etc.) always passes an "it" aka $_ argument to its blocks/statements, other blocks do not have an argument automatically passed to them, but they will accept one if the writer of the code passes one manually.
To support this wonderful idiom in which one can either pass or not pass an argument, blocks other than ones that are automatically fed an $_ are given this default of binding $_ to the outer block's $_.
I'm especially confused by Block 2, which is not the outermost block.
I'm confused by you being especially confused by that. :) If the foregoing hasn't sufficiently cleared this last aspect up for you, please comment on what it is about this last bit that's especially confusing.
During compilation the compiler has to keep track of various things. One of which is the current block that it is compiling.
The block object gets stored in the compiled code wherever it sees the special variable $?BLOCK.
Basically the compile-time variables aren't really variables, but more of a macro.
So whenever it sees $?BLOCK the compiler replaces it with whatever the current block the compiler is currently compiling.
It just happens that $?OUTER::BLOCK is somehow close enough to $?BLOCK that it replaces that too.
I can show you that there really isn't a variable by that name by trying to look it up by name.
{ say ::('&?BLOCK') } # ERROR: No such symbol '&?BLOCK'
Also every pair of {} (that isn't a hash ref or hash index) denotes a new block.
So each of these lines will say something different:
{
say $?BLOCK.WHICH;
say "{ $?BLOCK.WHICH }";
if True { say $?BLOCK.WHICH }
}
That means if you declare a variable inside one of those constructs it is contained to that construct.
"{ my $a = "abc"; say $a }"; # abc
say $a; # COMPILE ERROR: Variable '$a' is not declared
if True { my $b = "def"; say $b } # def
say $b; # COMPILE ERROR: Variable '$b' is not declared
In the case of postfix for, the left side needs to be a lambda/closure so that for can set $_ to the current value.
It was probably just easier to fake it up to be a Block than to create a new Code type just for that use.
Especially since an entire Raku source file is also considered a Block.
A bare Block can have an optional argument.
my &foo;
given 5 {
&foo = { say $_ }
}
foo( ); # 5
foo(42); # 42
If you give it an argument it sets $_ to that value.
If you don't, $_ will point to whatever $_ was outside of that declaration. (Closure)
For many of the uses of that construct, doing that can be very handy.
sub call-it-a (&c){
c()
}
sub call-it-b (&c, $arg){
c( $arg * 10 )
}
for ^5 {
call-it-a( { say $_ } ); # 0␤ 1␤ 2␤ 3␤ 4␤
call-it-b( { say $_ }, $_ ); # 0␤10␤20␤30␤40␤
}
For call-it-a we needed it to be a closure over $_ to work.
For call-it-b we needed it to be an argument instead.
By having :( ;; $_? is raw = OUTER::<$_> ) as the signature it caters to both use-cases.
This makes it easy to create simple lambdas that just do what you want them to do.

Unexpected Hash flattening

I'm looking for explanation why those two data structures are not equal:
$ perl6 -e 'use Test; is-deeply [ { a => "b" } ], [ { a => "b" }, ];'
not ok 1 -
# Failed test at -e line 1
# expected: $[{:a("b")},]
# got: $[:a("b")]
Trailing comma in Hashes and Arrays is meaningless just like in P5:
$ perl6 -e '[ 1 ].elems.say; [ 1, ].elems.say'
1
1
But without it Hash is somehow lost and it gets flattened to array of Pairs:
$ perl6 -e '[ { a => "b", c => "d" } ].elems.say;'
2
I suspect some Great List Refactor laws apply here but I'd like to get more detailed explanation to understand logic behind this flattening.
Trailing comma in Hashes and Arrays is meaningless just like in P5
No, it's not meaningless:
(1 ).WHAT.say ; # (Int)
(1,).WHAT.say ; # (List)
The big simplification in the Great List Refactor was switching to the single argument rule for iterating features [1]. That is to say, features like a for or the array and hash composers (and subscripts) always get a single argument. That is indeed what's going on with your original example.
The single argument may be -- often will be -- a list of values, possibly even a list of lists etc., but the top level list would still then be a single argument to the iterating feature.
If the single argument to an iterating feature does the Iterable role (for example lists, arrays, and hashes), then it's iterated. (This is an imprecise formulation; see my answer to "When does for call the iterator method?" for a more precise one.)
So the key thing to note here about that extra comma is that if the single argument does not do the Iterable role, such as 1, then the end result is exactly the same as if the argument were instead a list containing just that one value (i.e. 1,):
.perl.say for {:a("b")} ; # :a("b") Iterable Hash was iterated
.perl.say for {:a("b")} , ; # {:a("b")} Iterable List was iterated
.perl.say for 1 ; # 1 Non Iterable 1 left as is
.perl.say for 1 , ; # 1 Iterable List was iterated
The typical way "to preserve structure [other than] using trailing comma when single element list is declared" (see comment below), i.e. to stop a single Iterable value being iterated as it normally would, is by item-izing it with a $:
my @t = [ $[ $[ "a" ] ] ];
@t.push: "b";
@t.perl.say; # [[["a"],], "b"]
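Applied to the hash from the question, the three spellings can be compared side by side. This is a small sketch with a made-up %h variable standing in for the hash literal:

```raku
my %h = a => "b", c => "d";

say [ %h  ].elems;   # 2: the single Hash argument is iterated into its Pairs
say [ %h, ].elems;   # 1: the single argument is a List holding one Hash
say [ $%h ].elems;   # 1: the itemized Hash is not iterated
```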
[1] The iteration is used to get values to be passed to some code in the case of a for; to get values to become elements of the array/hash being constructed in the case of a composer; to get an indexing slice in the case of a subscript; and so on for other iterating features.

Does an anonymous parameter in a Perl 6 signature discard the value?

In the Perl 6 Signature docs, there's an example of an anonymous slurpy parameter:
sub one-arg (@) { }
sub slurpy (*@) { }
one-arg (5, 6, 7); # ok, same as one-arg((5, 6, 7))
slurpy (5, 6, 7); # ok
slurpy 5, 6, 7 ; # ok
There are no statements in the subroutine, mostly because the text around this is about the parameter list satisfying the signature rather than what the subroutine does with it.
I was playing with that and trying to make a subroutine that takes a list of one or more items (so, not zero items). I didn't particularly care to name them. I figured I'd still have access to the argument list in @_ even with the signature. However, you only get @_ when you don't have a signature:
$ perl6
To exit type 'exit' or '^D'
> sub slurpy(*@) { say @_ }
===SORRY!=== Error while compiling:
Placeholder variable '@_' cannot override existing signature
------> sub⏏ slurpy(*@) { say @_ }
Is there another way to get the argument list, or does the anonymous parameter discard them? I see them used in the section on type constraints, but there isn't an example that uses any of the parameter values. Can I still get the argument list?
The values aren't discarded; you can for example access it through nextsame:
multi access-anon(*@) is default {
say "in first candidate";
nextsame;
}
multi access-anon(*@a) {
@a;
}
say access-anon('foo')
Output:
in first candidate
[foo]
But to get to your original objective (an array with at least one element), you don't actually need to access the list; you can use a sub-signature:
sub at-least-one(@ [$, *@]) { }
at-least-one([1]); # no error
at-least-one([]); # Too few positionals passed; expected at least 1 argument but got only 0 in sub-signature
Does an anonymous parameter in a Perl 6 signature discard the value?
Yes, unless you capture an argument using some named parameter in the function's signature, it won't be available in the function's body. (PS: They're not literally discarded though, as moritz's answer shows.)
I figured I'd still have access to the argument list in @_ even with the signature.
@_ is not an alternative to using parameters - it is a parameter.
Every function has a well-defined signature that specifies its parameters, and they represent the only mechanism for the function body to get at the values the function is passed.
It's just that there are three different ways to declare a function signature:
If you explicitly write out a signature, that's what is used.
If you use placeholder parameters (@_ or $^foo etc.) in the body of the function, the compiler uses that information to build the signature for you: say { $^a + @_ }.signature; # ($a, *@_)
If neither of the above is the case, then the signature becomes:
In case of subroutines, one that accepts zero arguments: say (sub { 42 }).signature; # ()
In case of bare blocks, one that accepts zero or one argument available as $_: say { 42 }.signature; # (;; $_? is raw)
In all cases, the function ends up with an unambiguous signature at compile time. (Trying to use $^foo in a function that already has an explicit signature, is a compile-time error.)
Is there another way to get the argument list
Make sure to capture it with a non-anonymous parameter. If you want it to be accessible as @_, then call it that in the explicit signature you're writing.
I was [...] trying to make a subroutine that takes a list of one or more items
You can use a sub-signature for that:
sub foo (*@_ [$, *@]) { ... };
Or alternatively a where constraint:
sub foo (*@_ where .so) { ... };
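A runnable variant of the where approach might look like the following sketch. The sub name and the block-form constraint are my own choices for illustration:

```raku
# Hypothetical sub: a slurpy that refuses an empty argument list.
sub at-least-one ( *@items where { .elems > 0 } ) { @items.elems }

say at-least-one(1, 2, 3);                  # 3
say (try { at-least-one() }) // 'rejected'; # rejected
```

The empty call fails at bind time with a constraint error, which try turns into an undefined value here.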
To have @_ populated, the signature either has to be implicit or explicit, not both.
sub implicit { say @_ } # most like Perl 5's behaviour
sub explicit ( *@_ ) { say @_ }
sub placeholder { $^a; say @_ }
This applies to blocks as well
my &implicit = { say @_ }
my &explicit = -> *@_ { say @_ }
my &placeholder = { $^a, say @_ }
Blocks can also have an implicit parameter of $_, but @_ takes precedence if it is there.
{ say $_ }(5) # 5
$_ = 4;
{ @_; say $_ }(5) # 4
It makes sense to do it this way because one programmer may think it works the way you think it does, or that it is slurpy like it would be if implicit, and another may think it gets all of the remaining arguments.
sub identical ( @_ ) { say @_ }
sub slurpy ( *@_ ( @, *@ ) ) { say @_ } # same as implicit
sub slurpy2 ( **@_ ( @, *@ ) ) { say @_ } # non-flattening
sub remaining ( @, *@_ ) { say @_ }
identical [1,2]; # [1 2]
slurpy $[1,2],3,4,5; # [[1 2] 3 4 5]
slurpy2 [1,2],3,4,5; # [[1 2] 3 4 5]
remaining [1,2],3,4,5; # [3 4 5]
@_ may also be added as a mistake, and in that case it would be preferable for it to produce an error.
There is no way to get at the raw arguments without declaring a capture parameter.
sub raw-one ( |capture ( @ ) ) { capture.perl }
sub raw-slurpy ( |capture, *@ ) { capture.perl }
raw-one [1,2]; # \([1, 2])
raw-slurpy 1,2 ; # \(1, 2)

What is closure and why to use it?

What is a closure in Groovy?
Why do we use closures?
Are you asking about Closure annotation parameters?
[...
An interesting feature of annotations in Groovy is that you can use a closure as an annotation value. Therefore annotations may be used with a wide variety of expressions and still have IDE support. For example, imagine a framework where you want to execute some methods based on environmental constraints like the JDK version or the OS. One could write the following code:
class Tasks {
Set result = []
void alwaysExecuted() {
result << 1
}
@OnlyIf({ jdk>=6 })
void supportedOnlyInJDK6() {
result << 'JDK 6'
}
@OnlyIf({ jdk>=7 && windows })
void requiresJDK7AndWindows() {
result << 'JDK 7 Windows'
}
}
...]
Source: http://docs.groovy-lang.org/
Closures are a powerful concept with which you can implement a variety of things and which enable specifying DSLs. They are sort of like Java 8 lambdas, but more powerful and versatile. You don't need to use closures, but they can make many things easier.
Since you didn't really specify a concrete question, I'll just point you to the strategy pattern example in the Groovy docs:
http://docs.groovy-lang.org/latest/html/documentation/#_strategy_pattern
Think of the closure as an executable unit on its own, like a method or function, except that you can pass it around like a variable, but can do a lot of things that you would normally do with a class, for example.
An example: You have a list of numbers and you either want to add +1 to each number, or you want to double each number, so you say
def nums = [1,2,3,4,5]
def plusone = { item ->
item + 1
}
def doubler = { item ->
item * 2
}
println nums.collect(plusone)
println nums.collect(doubler)
This will print out
[2, 3, 4, 5, 6]
[2, 4, 6, 8, 10]
So what you achieved is that you separated the function, the 'what to do' from the object that you did it on. Your closures separate an action that can be passed around and used by other methods, that are compatible with the closure's input and output.
What we did in the example is that we had a list of numbers and we passed each of them to a closure that did something with it. Either added +1 or doubled the value, and collected them into another list.
And this logic opens up a whole lot of possibilities to solve problems smarter, cleaner, and write code that represents the problem better.

Resources